Abstract

The framework of consistent query answers and repairs has been introduced to 
alleviate the impact of inconsistent data on the answers to a query. A repair is a 
minimally different consistent instance and an answer is consistent if it is present 
in every repair. In this article we study the complexity of consistent query answers 
and repair checking in the presence of universal constraints. 

We propose an extended version of the conflict hypergraph that captures
all repairs w.r.t. a set of universal constraints. We show that repair checking is in
PTIME for the class of full tuple-generating dependencies and denial constraints,
and we present a polynomial repair algorithm. This algorithm is sound, i.e. it always
produces a repair, and complete, i.e. every repair can be constructed. Next,
we present a polynomial-time algorithm computing consistent answers to ground
quantifier-free queries in the presence of denial constraints, join dependencies, and
acyclic full tuple-generating dependencies. Finally, we show that extending the class
of constraints leads to intractability. For arbitrary full tuple-generating dependencies
consistent query answering becomes coNP-complete. For arbitrary universal
constraints consistent query answering is $\Pi_2^p$-complete and repair checking is
coNP-complete.
1 Introduction



Traditionally, the consistency of a database with a set of integrity constraints 
was maintained by a DBMS [35]. While integrity constraints continue to ex- 
press important properties of the stored data, in many novel database appli- 
cations enforcing the consistency becomes problematic. For example, in the 
scenario of data integration even if the sources are separately consistent, to- 
gether they may contribute conflicting data. Because data sources are often 
autonomous and their contents cannot be altered, the consistency cannot be 
restored by means of data manipulation. Consistency violations occur natu- 
rally also in the context of long running data manipulations, delayed updates 
on data warehouses, and legacy databases. Finally, consistency enforcement 
may be deactivated for efficiency reasons. At the same time, the semantic 
properties expressed by integrity constraints often influence the way the user 
formulates her queries. Hence, if the database is inconsistent, evaluating the 
queries may yield incorrect and misleading answers. 

To address the problem of the potential impact of inconsistencies on query 
results Arenas et al. have proposed the framework of repairs and consistent 
query answers [3]. A repair is a consistent database instance minimally dif- 
ferent from the original one. The consistent query answers are the answers 
present in every repair. Intuitively, the repairs represent all possible ways to 
restore consistency in the database and an answer is consistent if it is obtained 
regardless of the way the conflicts are resolved, i.e. the answer that is not af- 
fected by the inconsistencies. This framework has served as a foundation for 
most of the subsequent work in the area of querying inconsistent databases 
(for the surveys of the area, see [11,9,15,14,22]). 

Example 1 We consider a database that stores information on the occurrence 
of a genetically inherited disease neurofibromatosis (NF) causing tumors of the 
nervous tissue. NF is an autosomal dominant disorder, which means that only 
one mutated gene needs to be present in the genome of an affected person.
Typically, this gene is inherited from one of the parents.

The schema of the database contains two relations: $NF(\underline{Name}, Diag)$ and
$Parent(Name, Child)$, where the underline indicates the (primary) key of a
relation. The following constraint captures the inheritance factor of NF:

$$NF(x, \text{'yes'}) \land Parent(y_1, x) \land Parent(y_2, x) \land y_1 \neq y_2 \Rightarrow NF(y_1, \text{'yes'}) \lor NF(y_2, \text{'yes'}). \tag{1}$$



Now, consider the database instance $I_1$ in Figure 1. This instance violates (1):

    NF                    Parent
    Name     Diag         Name     Child
    Steve    no           Steve    Donald
    Mary     no           Mary     Donald
    Donald   yes

Fig. 1. Inconsistent database $I_1$.

Donald is diagnosed with NF while neither of his parents is. This violation
can be resolved in three ways:

(1) By inserting a tuple with a positive diagnosis for one of Donald's
parents. Because of the key dependency, this creates a conflict with the
already existing tuple, which is consequently deleted. This yields the
following repairs:

$I'_1 = \{NF(Steve, yes), NF(Mary, no), NF(Donald, yes), Parent(Steve, Donald), Parent(Mary, Donald)\}$.

$I'_2 = \{NF(Steve, no), NF(Mary, yes), NF(Donald, yes), Parent(Steve, Donald), Parent(Mary, Donald)\}$.

(2) By removing one of the tuples of the Parent relation, which gives the
following repairs:

$I'_3 = \{NF(Steve, no), NF(Mary, no), NF(Donald, yes), Parent(Mary, Donald)\}$.

$I'_4 = \{NF(Steve, no), NF(Mary, no), NF(Donald, yes), Parent(Steve, Donald)\}$.

(3) By removing the tuple with the diagnosis of Donald, giving the following
repair:

$I'_5 = \{NF(Steve, no), NF(Mary, no), Parent(Steve, Donald), Parent(Mary, Donald)\}$.

Consider now the query $NF(Steve, no)$ asking if Steve is not diagnosed with
NF. The answer to this query in the instance $I_1$ is true. However, true is not
the consistent query answer because of the repair $I'_1$ (which indicates that the
diagnosis of Steve may be incorrect).

We note that the framework of consistent query answers is parametrized by 
the notion of minimality used to define repairs. The original notion uses the
symmetric set difference (between databases as sets of tuples) and set inclu- 
sion, i.e. the repairs are obtained by deleting and inserting a minimal set of 
tuples. This notion is most commonly considered in the literature and it is 
used in this paper. Other investigated notions of minimality use asymmet- 
ric set difference [13] and the cardinality of the symmetric difference [4,31]. 
Finally, various notions of minimality have been considered to accommodate 
repairs obtained by attribute value modification [31,10,30,12,39]. 

It was observed very early that the number of possible repairs may be expo- 
nential even if we consider one functional dependency [5]. A naive approach 
to compute consistent query answers by materializing all repairs and conse- 
quently evaluating the query in every repair may thus be simply impractical. 
Consequently, to establish the tractability of database repairing and comput- 
ing consistent query answers two fundamental decision problems have been 
investigated: (i) repair checking - checking if a given database instance is a 
repair, and (ii) consistent query answering - checking if an answer to a query 
is present in every repair. Most of the research in this area uses the notion of 
data complexity, commonly used to study tractability of computing answers
in relational databases [38]. It expresses the complexity of the problems
in terms of the database size only: the set of integrity constraints and
the query are assumed to be fixed. We note that the study of complexity of
the two decision problems is motivated by the belief that tractable decision 
algorithms can be converted into efficient algorithms that compute consistent 
query answers and construct repair(s) of an inconsistent database. This be- 
lief is validated, for example, by existing polynomial-time algorithms where 
decision problems play a central role in the computation of consistent query 
answers [17,24]. 

The problems of repair checking and consistent query answering are param- 
eterized by the class of integrity constraints. Denial constraints allow the 
user to specify sets of tuples that cannot be simultaneously present in the
database because they create a conflict. This class of constraints includes
equality-generating dependencies, thus also functional dependencies, and
exclusion constraints [1]. The standard definition of denial constraints allows
conjunctions of $=$ and $\neq$ comparisons to relate the values of the tuples
creating conflicts. A more general version of denial constraints allows any
Boolean combination of formulas using $=$, $\neq$, $<$, $\leq$, $>$, and $\geq$ [8]. This version
has also been studied in the context of consistent query answers [15]. There,
the authors proposed the conflict hypergraph to store all conflicts present in the
database and subsequently use it to construct repairs and efficiently compute
consistent query answers.

Universal constraints generalize denial constraints by allowing one to express
conflicts created not only by the presence of some tuples but also by the simultaneous
absence of other tuples. This class of constraints contains full tuple-generating
dependencies (full TGDs), which have been thoroughly studied in the context
of relational databases [1,32,33]. Full TGDs contain the important class
of join dependencies (JDs), and thus also its subclass of multi-valued dependencies
(MVDs), which are frequently used in the setting of denormalized
databases [1,35]. In this context, the constraints are typically not actively enforced,
which permits the occurrence of insertion/deletion/update anomalies.

Example 2 Consider a denormalized database that stores information
about the locations and the offer of different chains of coffee shops. The schema
is $CoffeeShop(Chain, Location, Beverage)$. The list of beverages offered by a
particular chain is the same in every coffee shop, which is expressed with the
following join dependency:

$$CoffeeShop : \bowtie[\{Chain, Location\}, \{Chain, Beverage\}].$$

Now, consider the instance in Figure 2. This instance is inconsistent because

    CoffeeShop
    Chain        Location         Beverage
    Starbucks    Delaware Ave.    Latte
    Starbucks    Delaware Ave.    Espresso
    Starbucks    Main Str.        Latte
    Spot         Elmwood Ave.     Latte

Fig. 2. Inconsistent instance $I_2$.



Espresso is offered at Starbucks on Delaware Avenue but not at Starbucks on 
Main Street. This instance has three repairs: 

(1) The first corresponds to the scenario where Espresso has been added to
the offer of Starbucks but the change has not been propagated properly.
This repair is obtained by inserting the tuple

$CoffeeShop(Starbucks, Main Str., Espresso)$.

(2) The second corresponds to the scenario where Espresso has been removed
from the offer of Starbucks but the change has not been propagated properly.
This repair is obtained by deleting the tuple

$CoffeeShop(Starbucks, Delaware Ave., Espresso)$.

(3) The third corresponds to the scenario where the coffee shop located on
Main Street is being closed. This repair is obtained by deleting the tuple

$CoffeeShop(Starbucks, Main Str., Latte)$.

Now, consider the query $CoffeeShop(Starbucks, Delaware Ave., Latte)$. Here true
is the consistent answer because the query is true in every repair. If
we consider the query $CoffeeShop(Starbucks, Delaware Ave., Espresso)$, we
note that the answer to this query is true in the original instance. We observe,
however, that true is not the consistent query answer because the query is not
true in every repair.
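
The reasoning of Example 2 can be replayed by brute force. The following is a minimal sketch, not the algorithm developed in this paper: it enumerates candidate instances, keeps those satisfying the join dependency, and selects the ones whose symmetric difference with the original instance is minimal under set inclusion. The function names and the encoding of facts as triples are assumptions made for illustration, as is the restriction of candidate facts to the join of the projections of the original instance. The enumeration is exponential and only meant for toy inputs.

```python
from itertools import combinations

# Facts of CoffeeShop(Chain, Location, Beverage), encoded as triples.
I2 = frozenset({
    ("Starbucks", "Delaware Ave.", "Latte"),
    ("Starbucks", "Delaware Ave.", "Espresso"),
    ("Starbucks", "Main Str.", "Latte"),
    ("Spot", "Elmwood Ave.", "Latte"),
})


def join_of_projections(instance):
    """Join of the projections on {Chain, Location} and {Chain, Beverage}."""
    return frozenset(
        (chain1, location, beverage)
        for (chain1, location, _) in instance
        for (chain2, _, beverage) in instance
        if chain1 == chain2
    )


def satisfies_jd(instance):
    """The join dependency holds iff the instance equals the join of its projections."""
    return join_of_projections(instance) == frozenset(instance)


def repairs(instance):
    """All consistent instances with an inclusion-minimal symmetric difference."""
    # Assumption: a repair only uses facts from the join of the projections.
    universe = sorted(join_of_projections(instance))
    consistent = [
        frozenset(candidate)
        for size in range(len(universe) + 1)
        for candidate in combinations(universe, size)
        if satisfies_jd(frozenset(candidate))
    ]
    return [
        j for j in consistent
        if not any((other ^ instance) < (j ^ instance) for other in consistent)
    ]


def consistently_true(fact, instance):
    """True iff the ground atomic query `fact` holds in every repair."""
    return all(fact in repair for repair in repairs(instance))
```

Calling `repairs(I2)` on this encoding yields the three repairs listed above, and `consistently_true` separates the Latte query from the Espresso query.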

The complexity of repair checking and consistent query answering in the pres- 
ence of general universal constraints has not been thoroughly studied. Previ- 
ous research conducted in this area shows that computing consistent query 
answers is: 

• in PTIME for the class of binary universal constraints and a restricted class 
of quantifier-free queries [3]. 

• in PTIME for the class of denial constraints and quantifier-free queries [15]. 

• in PTIME when at most one primary key per relation is present and the
queries belong to a restricted class of conjunctive queries $C_{forest}$ [24,23,25].

• coNP-complete for primary keys and arbitrary conjunctive queries [15].

• $\Pi_2^p$-complete for arbitrary sets of functional and inclusion dependencies [15],
when repairs are constructed using deletions only.

• undecidable for arbitrary sets of functional and inclusion dependencies [13].

We remark that the class of universal constraints captures only full inclusion
dependencies; in this paper we do not consider general inclusion dependencies.

In this paper we investigate computing consistent query answers and repairs in 
the presence of universal constraints. This research constitutes a continuation 
and substantial extension of [15]. As in [15], in the constraint definition
we allow any Boolean combination of atomic formulas that use the binary
relations $=$, $\neq$, $<$, $\leq$, $>$, and $\geq$. We propose an extended version of the conflict
hypergraph whose hyperedges span both tuples present in and tuples absent from the
database. The size of the extended conflict hypergraph is still polynomial in the
size of the database and every repair corresponds to a maximal independent
set. Although the converse correspondence is not necessarily true, i.e., not
every maximal independent set defines a repair, we consider the extended
conflict hypergraph to be a compact representation of all repairs.

Next, we study the computational implications of universal constraints. In this 
paper we show that: 

• The complexity of repair checking is:

  - in PTIME for arbitrary full tuple-generating dependencies and denial
    constraints. Consequently, we present a polynomial database repairing
    algorithm that is both sound (always constructing a repair) and complete (able
    to construct every repair).

  - coNP-complete for arbitrary universal constraints.

• The complexity of consistent query answering is:

  - in PTIME for quantifier-free closed queries in the presence of join
    dependencies, acyclic full tuple-generating dependencies, and denial constraints.

  - coNP-complete for atomic queries in the presence of arbitrary full
    tuple-generating dependencies and denial constraints.

  - $\Pi_2^p$-complete for atomic queries in the presence of arbitrary universal
    constraints.

The paper is organized as follows. Section 2 contains basic notions and def- 
initions. In Section 3 we present the extended conflict hypergraph, study its 
properties, and investigate basic properties of the framework. In Section 4 we 
study the complexity of repair checking in the presence of full tuple-generating 
dependencies and we present a database repairing algorithm. In Section 5 we 
investigate the complexity of consistent query answering in the presence of 
full tuple-generating dependencies. In Section 6 we investigate the complexity 
of consistent query answering and repair checking in the presence of arbitrary 
universal constraints. In Section 7 we discuss related work. Section 8 contains 
final conclusions and the discussion of future work. 



2 Preliminaries

A database schema $S$ is a set of relation names of fixed arity (greater than 0)
and we use $R, P, \ldots$ to denote relation names. Relation attributes are drawn
from an infinite set of names $U$, and we use $A, B, C, \ldots$ to denote elements
of $U$ and $X, Y, Z, \ldots$ to denote finite subsets of $U$. For $R \in S$ we denote
the set of all attributes of $R$ by $attrs(R)$. Every element of $U$ is typed and we
consider only two disjoint infinite domains: $\mathbb{Q}$ (rationals) and $D$ (uninterpreted
constants). We assume that two constants are equal if and only if they have
the same name, and we allow the standard built-in relation symbols $=$ and
$\neq$ over $D$. We also allow the built-in relation symbols $=$, $\neq$, $<$, $\leq$, $>$, and $\geq$
with their natural interpretation over $\mathbb{Q}$. We use these symbols together with
the vocabulary of relational names $S$ to build a first-order language $\mathcal{L}$. An
$\mathcal{L}$-formula is:

• closed (or a sentence) if it has no free variables,

• ground if it has no variables whatsoever,

• quantifier-free if it has no quantifiers,

• atomic if it has no quantifiers and no Boolean connectives.

Finally, a fact is an atomic ground $\mathcal{L}$-formula and a literal is a fact or the
negation of a fact.

Database instances are finite, first-order structures over the schema. Often,
we will find it more convenient to view an instance $I$ as the finite set of all
facts satisfied by the instance, $\{R(t) \mid R \in S, I \models R(t)\}$.

In the sequel, we will denote tuples of variables by $\bar{x}, \bar{y}, \ldots$, tuples of constants
by $t, s, \ldots$, facts by $p, q, r, \ldots$, quantifier-free formulas using only built-in
predicates by $\varphi$, and instances by $I, J, \ldots$

2.1 Integrity constraints 

An integrity constraint is an $\mathcal{L}$-sentence, i.e. a closed first-order $\mathcal{L}$-formula. In
this paper we consider the class of universal constraints, $\mathcal{L}$-sentences of the
form

$$\forall \bar{x}.\ \neg[R_1(\bar{x}_1) \land \ldots \land R_n(\bar{x}_n) \land \neg P_1(\bar{y}_1) \land \ldots \land \neg P_m(\bar{y}_m) \land \varphi(\bar{x})], \tag{2}$$

where $\varphi(\bar{x})$ is a quantifier-free formula referring to built-in relation names only
and $\bar{y}_1 \cup \ldots \cup \bar{y}_m \subseteq \bar{x}_1 \cup \ldots \cup \bar{x}_n = \bar{x}$ (this is a standard safety requirement [1]).
Also, we make the natural assumption that $n + m > 0$. The constraint (2) will
often be presented as

$$R_1(\bar{x}_1) \land \ldots \land R_n(\bar{x}_n) \land \varphi(\bar{x}) \to P_1(\bar{y}_1) \lor \ldots \lor P_m(\bar{y}_m), \tag{3}$$

where all the variables are implicitly universally quantified.

The class of universal constraints contains the following basic classes of in- 
tegrity constraints: 

(1) Full tuple-generating dependencies (full TGDs): universal constraints with
one atom in the rhs ($m = 1$). Often, the full TGDs considered in the literature
have a conjunction of atoms in the rhs. We remark, however, that a multi-head
full TGD is equivalent to a set of single-head full TGDs [1].

(2) Join dependencies (JDs), commonly formulated as $R : \bowtie[X_1, \ldots, X_k]$, where
$R$ is a relation name and $X_1, \ldots, X_k$ are subsets of attributes of $R$ whose
union contains all attributes of $R$. A common relational algebra definition
is $R = \pi_{X_1}(R) \bowtie \ldots \bowtie \pi_{X_k}(R)$. The equivalent full tuple-generating
dependency is:

$$R(\bar{x}_1) \land \ldots \land R(\bar{x}_k) \land \bigwedge_{1 \leq i, j \leq k} \bar{x}_i[X_i \cap X_j] = \bar{x}_j[X_j \cap X_i] \to R(\bar{y}),$$

where $\bar{z}[Y]$ is the subvector of $\bar{z}$ that corresponds to the attributes in $Y$,
and $\bar{y} \subseteq \bar{x}_1 \cup \ldots \cup \bar{x}_k$ is such that $\bar{y}[X_i \setminus \bigcup_{1 \leq j < i} X_j] = \bar{x}_i[X_i \setminus \bigcup_{1 \leq j < i} X_j]$ for
$i \in \{1, \ldots, k\}$.

(3) Denial constraints: universal constraints with no atoms in the rhs ($m = 0$):

$$R_1(\bar{x}_1) \land \ldots \land R_n(\bar{x}_n) \land \varphi(\bar{x}) \to \mathit{false}.$$

(4) Functional dependencies (FDs), commonly formulated as $R : X \to Y$,
where $X$ and $Y$ are sets of attributes of $R$. An FD $R : X \to Y$ is
expressed by the following denial constraint:

$$R(\bar{x}_1) \land R(\bar{x}_2) \land \bar{x}_1[X] = \bar{x}_2[X] \land \neg(\bar{x}_1[Y] = \bar{x}_2[Y]) \to \mathit{false}.$$

The following restriction will allow us to identify a tractable class of integrity 
constraints. 

Definition 1 (Acyclic constraints) The dependency graph $\mathcal{D}(S, F)$ of a
set of universal constraints $F$ is a directed graph whose set of vertices is the
relational schema $S$ and for any constraint (2) in $F$ there is an edge from $P_j$
to $R_i$ for every $i \in \{1, \ldots, n\}$ and every $j \in \{1, \ldots, m\}$. The set of constraints
$F$ is acyclic if $\mathcal{D}(S, F)$ is an acyclic graph.

We also adapt the standard notions of height and depth of a node in a tree to
(possibly cyclic) dependency graphs [18]. Given a dependency graph $\mathcal{D}(S, F)$,
the acyclic depth of a node $R$, denoted $depth(R)$, is the maximal length of a
directed acyclic path that ends in $R$, where the length of a path is the number
of edges it comprises. The acyclic height of $R$, denoted $height(R)$, is the
maximal length of a directed acyclic path that originates in $R$. The acyclic
height of $\mathcal{D}(S, F)$ is the maximum acyclic height of all nodes in $\mathcal{D}(S, F)$. We note
that both the acyclic height and the acyclic depth of a node are bounded by the
acyclic height of the dependency graph.

Example 3 Figure 3 presents a dependency graph for the schema

$$S = \{R(A, B), P(A, B), T(A, B, C), S(A, B, C)\}$$

and a cyclic set of constraints

$$F = \{R(x, y) \land P(y, z) \to S(x, y, z),\ S(x, y, z) \to T(x, y, z),\ T(x, y, z) \to P(x, y) \lor P(y, z)\}.$$

              R    P    S    T
    depth     3    2    2    2
    height    0    3    2    2

Fig. 3. A cyclic dependency graph.

The acyclic height of this graph is 3.
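
The values of Example 3 can be recomputed directly from Definition 1. The sketch below is illustrative only: the encoding of a constraint as a pair (lhs relation names, rhs relation names) and all function names are assumptions, and "directed acyclic path" is read as a path that visits no node twice. The search is exponential in the worst case, which is acceptable here because the schema is fixed.

```python
# Each constraint of Example 3 as (relations in the lhs, relations in the rhs).
CONSTRAINTS = [
    (["R", "P"], ["S"]),   # R(x,y) AND P(y,z) -> S(x,y,z)
    (["S"], ["T"]),        # S(x,y,z) -> T(x,y,z)
    (["T"], ["P", "P"]),   # T(x,y,z) -> P(x,y) OR P(y,z)
]
NODES = ["R", "P", "S", "T"]

# Definition 1: an edge from every rhs relation P_j to every lhs relation R_i.
EDGES = {(p, r) for lhs, rhs in CONSTRAINTS for p in rhs for r in lhs}


def longest_simple_path_from(node, edges, visited=frozenset()):
    """Length (in edges) of the longest path from `node` that repeats no node."""
    visited = visited | {node}
    best = 0
    for (source, target) in edges:
        if source == node and target not in visited:
            best = max(best, 1 + longest_simple_path_from(target, edges, visited))
    return best


def height(node):
    """Acyclic height: longest simple path originating in `node`."""
    return longest_simple_path_from(node, EDGES)


def depth(node):
    """Acyclic depth: longest simple path ending in `node` (search reversed edges)."""
    reversed_edges = {(target, source) for (source, target) in EDGES}
    return longest_simple_path_from(node, reversed_edges)


def graph_height():
    """Acyclic height of the whole dependency graph."""
    return max(height(node) for node in NODES)
```

Under this reading, `depth` and `height` agree with the table in Figure 3, including height 0 for $R$, which has no outgoing edge.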

Database consistency is defined in the standard way. 

Definition 2 Given a database instance $I$ and a set of integrity constraints
$F$, $I$ is consistent with $F$ if $I \models F$ in the standard model-theoretic sense;
otherwise $I$ is inconsistent.

We observe that because we do not allow relation names of arity 0 and we
consider universal constraints satisfying the safety requirement, the constraint (2)
can have negative atoms only if it has positive atoms as well, i.e. $m > 0$ implies
$n > 0$. Therefore, the prerequisite of a constraint violation is the existence of
some facts in the database. Consequently, if the instance is empty then all
the constraints are satisfied. We note that this conforms to the behavior of typical
SQL database management systems: an empty database satisfies any set of
constraints expressed in SQL.



2.2 Queries 



In this paper we deal only with closed queries, i.e. closed $\mathcal{L}$-formulas. The
query answers are Boolean: true or false. A query is atomic (quantifier-free)
if the $\mathcal{L}$-formula is atomic (quantifier-free respectively). A conjunctive query
is an existentially quantified conjunction of atomic $\mathcal{L}$-formulas.

Definition 3 true is the answer to a closed query $Q$ in an instance $I$ if
$I \models Q$; otherwise the answer to $Q$ is false.



2.3 Repairs 



Repairs are defined as consistent instances that are minimally different from
the original one. The differences are measured in terms of the set of facts that
need to be deleted and inserted. Because we view an instance as a set of
facts, the (symmetric) difference $\Delta(I_1, I_2)$ of the instances $I_1$ and $I_2$ is defined
as

$$\Delta(I_1, I_2) = (I_1 \setminus I_2) \cup (I_2 \setminus I_1).$$

Given an instance $I$, the relative proximity relation $\leq_I$ on instances is defined
as

$$I_1 \leq_I I_2 \iff \Delta(I, I_1) \subseteq \Delta(I, I_2).$$

We note that $\leq_I$ is a partial order and we write $I_1 <_I I_2$ if $I_1 \leq_I I_2$ and
$I_1 \neq I_2$.

Definition 4 ([3]) Given a set of integrity constraints $F$ and database
instances $I$ and $I'$, we say that $I'$ is a repair of $I$ w.r.t. $F$ if $I'$ is a $\leq_I$-minimal
instance consistent with $F$. By $Repairs(I, F)$ we denote the set of all repairs
of $I$ w.r.t. $F$.

Because an empty instance over a schema always satisfies any set of universal 
constraints over the schema, the set of repairs is guaranteed to be non-empty. 

2.4 Consistent query answers 

Finally, we use the repairs to define the consistent query answers. 

Definition 5 ([3]) true (false) is the consistent answer to a closed query $Q$,
denoted $I \models_F Q$ ($I \models_F \neg Q$ resp.), if and only if true (false resp.) is the answer
to $Q$ in every repair of $I$ w.r.t. $F$.

We note that our approach can be easily extended to open queries along the
lines of [15,16]. In essence, from an open query $Q(\bar{x})$ we derive a query
that defines an envelope, a superset of the consistent query answers to $Q(\bar{x})$. For
every tuple $t$ from the envelope, $t$ is a consistent answer to $Q(\bar{x})$ if and only if
true is the consistent answer to the closed query $Q(t)$.

2.5 Decision problems 

We consider here the following complexity classes: 

• PTIME: the class of decision problems solvable in polynomial time by de- 
terministic Turing machines; 

• coNP: the class of decision problems whose complements are solvable in 
polynomial time by nondeterministic Turing machines; 

• $\Pi_2^p$: the class of decision problems whose complements are solvable in
polynomial time by nondeterministic Turing machines with an NP oracle.

To investigate tractability of the framework of consistent query answers we
use the notion of data complexity [38]. This notion describes the
complexity of the problems in terms of the size of the database only and
assumes the remaining parts of the input to be fixed. There are two classical
decision problems that are investigated in the context of consistent query
answers [15]:

(i) repair checking - determining if a database instance is a repair of a given
database instance, i.e. the complexity of the following set

$$\mathcal{B}_F = \{(I, I') \mid I' \in Repairs(I, F)\}.$$

(ii) consistent query answering - determining if true is the consistent answer
to a given closed query in a given database w.r.t. a given set of integrity
constraints, i.e. the complexity of the following set

$$\mathcal{D}_{F,Q} = \{I \mid I \models_F Q\}.$$



3 Basic constructions and facts 

In this section we generalize the conflict hypergraph for denial constraints [5,15].
In the scope of this section, we fix an instance $I$ and a set of universal
constraints $F$.

3.1 Extended conflict hypergraph

For denial constraints, a conflict is a set of facts whose presence violates a 
constraint. For universal constraints, a conflict is created not only by the 
presence of some facts but also by the simultaneous absence of other facts. 

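Definition 6 (Conflict) A set of literals

$$\{R_1(t_1), \ldots, R_n(t_n), \neg P_1(s_1), \ldots, \neg P_m(s_m)\}$$

is a conflict w.r.t. a constraint

$$\forall \bar{x}.\ \neg[R_1(\bar{x}_1) \land \ldots \land R_n(\bar{x}_n) \land \neg P_1(\bar{y}_1) \land \ldots \land \neg P_m(\bar{y}_m) \land \varphi(\bar{x})],$$

if there exists a ground substitution $\theta$ of the variables $\bar{x}$ such that $\theta(\bar{x}_i) = t_i$ for
$i \in \{1, \ldots, n\}$, $\theta(\bar{y}_j) = s_j$ for $j \in \{1, \ldots, m\}$, and $\varphi(\theta(\bar{x}))$ holds.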

If the constraints are limited to denial constraints, conflicts can be resolved 
only by deleting facts. Moreover, deleting a fact will not create further con- 
flicts. Therefore, only conflicts created by the facts from the original instance 
need to be considered.

In the case of universal constraints, the picture is more complex. First, a
conflict can be resolved not only by deleting facts but also by inserting a 
fact. Second, deleting a fact can create conflicts caused by the absence of the 
fact. Similarly, inserting a fact can create conflicts caused by the presence of 
the fact. Moreover, a cascading propagation of conflicts can easily take place: 
resolving one conflict leads to the creation of another conflict whose resolution 
leads to yet another one, and so on. 

Example 4 Consider a schema $S = \{R(A, B), P(C)\}$ with one constraint
$F = \{R(x, y) \land P(x) \to P(y)\}$ and take $I = \{R(1, 2), R(2, 3), P(1)\}$. The
conflict $\{R(1, 2), P(1), \neg P(2)\}$ can be resolved by inserting $P(2)$ into the
instance $I$ to obtain $I \cup \{P(2)\}$. This creates the conflict $\{R(2, 3), P(2), \neg P(3)\}$,
which can be resolved by further inserting $P(3)$. Finally, we obtain the repair
$I_1 = I \cup \{P(2), P(3)\}$. The other repairs are $I_2 = I \setminus \{R(1, 2)\}$, $I_3 = I \setminus \{P(1)\}$,
and $I_4 = I \cup \{P(2)\} \setminus \{R(2, 3)\}$.

Clearly, to capture all repairs it is not enough to consider only the facts present 
in the original database instance. Also the facts potentially inserted need to 
be considered. We capture the set of relevant facts in the following way. 

Definition 7 (Hull) The hull is the minimal set of literals $Hull(I, F)$
satisfying the following conditions:

(1) $I \subseteq Hull(I, F)$,

(2) if a set $e$ is a conflict w.r.t. a constraint in $F$ such that every fact of $e$
belongs to $Hull(I, F)$, then for every $\neg P(t)$ in $e$, both $\neg P(t)$ and $P(t)$
belong to $Hull(I, F)$.

We note that this definition can be easily translated to a negation-free Datalog
program which computes the set of literals in the hull. The arities of the
predicates are equal to the arities of the corresponding relation names.

Example 5 For the set of constraints $F$ from Example 4 we construct the
following Datalog program:

$$\begin{aligned}
R^{+}(x, y) &\leftarrow R(x, y). \\
P^{+}(x) &\leftarrow P(x). \\
P^{+}(y) &\leftarrow R^{+}(x, y), P^{+}(x). \\
P^{-}(y) &\leftarrow R^{+}(x, y), P^{+}(x).
\end{aligned}$$

Now, if we treat the instance $I$ as the extensional database, the program above
has the following solution (least fixpoint):

$$I \cup \{R^{+}(1, 2), R^{+}(2, 3), P^{+}(1)\} \cup \{P^{+}(2), P^{+}(3), P^{-}(2), P^{-}(3)\},$$

which corresponds to $Hull(I, F) = I \cup \{P(2), \neg P(2), P(3), \neg P(3)\}$.
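
The least fixpoint of Example 5 can also be computed with a short iteration. The following sketch hard-codes the single constraint $R(x, y) \land P(x) \to P(y)$ of Example 4 rather than evaluating arbitrary Datalog. The encoding of facts as tuples such as `("R", 1, 2)`, the function name `hull`, and the split of the result into a set of positive literals and a set of facts whose negation is in the hull are assumptions made for illustration.

```python
def hull(instance):
    """Hull of `instance` w.r.t. the constraint R(x, y) AND P(x) -> P(y).

    Returns (positive, negated): the facts in the hull and the facts whose
    negation is in the hull.
    """
    positive = set(instance)  # condition (1): the instance is part of the hull
    negated = set()
    changed = True
    while changed:
        changed = False
        for fact in list(positive):
            if fact[0] != "R":
                continue
            _, x, y = fact
            # Condition (2): R(x, y) and P(x) are in the hull, so the conflict
            # {R(x, y), P(x), not P(y)} contributes both P(y) and not P(y).
            if ("P", x) in positive:
                for target in (positive, negated):
                    if ("P", y) not in target:
                        target.add(("P", y))
                        changed = True
    return positive, negated


# The instance I of Example 4.
I = {("R", 1, 2), ("R", 2, 3), ("P", 1)}
```

The loop stops as soon as a full pass adds no literal, which happens after polynomially many passes because every pass that continues adds at least one literal over the constants of the instance.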

Our intention is to use hyperedges to restrict the sets of literals used to
construct a repair so that no conflicts are present. Because the hull may contain
a fact and its negation, we also use an edge to prevent us from considering
these two together.

Definition 8 (Extended conflict hypergraph) The extended conflict
hypergraph $G(I, F)$ is a hypergraph whose set of vertices is $Hull(I, F)$ and whose
set of hyperedges consists of the following two types of sets:

(1) conflict hyperedges $e \subseteq Hull(I, F)$ such that $e$ is a conflict w.r.t. a
constraint in $F$,

(2) stabilizing edges $\{P(t), \neg P(t)\}$ such that both $P(t)$ and $\neg P(t)$ belong to
$Hull(I, F)$.

An independent set of the extended conflict hypergraph $G(I, F)$ is any subset
of $Hull(I, F)$ that contains no hyperedges. $M$ is a maximal independent set if
there exists no independent set $M' \subseteq Hull(I, F)$ such that $M \subsetneq M'$.

Example 6 Figure 4 contains the extended conflict hypergraph $G_1$ for the
instance $I$ w.r.t. the set of universal constraints $F$ from Example 4. A dotted
line is used for stabilizing edges connecting a fact and its negation (if present).
We observe that deleting $P(1)$ does not lead to a conflict, and consequently,
$\neg P(1)$ is not present in the hull.

Fig. 4. The extended conflict hypergraph $G_1$ for $I$ and $F$ from Example 4.

Since we assume the set of constraints to be fixed, the cardinality of each
conflict in $G(I, F)$ is bounded by the maximal number of atoms used in a
constraint definition, a constant $K$. To construct the set of hyperedges we
need to consider all subsets of the hull of cardinality bounded by $K$. Consequently,
the following holds.

Proposition 1 The extended conflict hypergraph $G(I, F)$ can be constructed
in time polynomial in the size of $I$ (data complexity). Also, the size of $G(I, F)$
is polynomial in the size of $I$.

The presented extension of the conflict hypergraph is backward-compatible 
with [15]: if we restrict the set of constraints to denial constraints only, the hull
is equal to the original instance and we obtain the standard conflict hyper- 
graph [5,15]. Finally, in the presence of denial constraints, because the repairs 
are obtained by deleting facts only, the repairs are maximal consistent subsets 
of the original instance. 

Proposition 2 ([15]) If $F$ is a set of denial constraints, each repair of $I$
w.r.t. $F$ corresponds to a maximal independent set of $G(I, F)$, and vice versa.

The same equivalence does not hold for the extended conflict hypergraph. For
instance, while the repair $I_1$ (Example 4) is a maximal independent set of
$G_1$ (Fig. 4), the repairs $I_2$, $I_3$, and $I_4$ are not. One reason for this is the use
of negated facts in the extended conflict hypergraph. We observe, however,
that if we complement $I_2 = \{R(2, 3), P(1)\}$ with the (relevant) negations
of facts that are not present in $I_2$, we obtain a maximal independent set
$\{R(2, 3), P(1), \neg P(2), \neg P(3)\}$. This holds for every repair.

Proposition 3 For any repair $I' \in Repairs(I, F)$ the set

$$Compl(I') = I' \cup \{\neg R(t) \in Hull(I, F) \mid R(t) \notin I'\}$$

is a maximal independent set of $G(I, F)$.

Proof: Naturally, $Compl(I')$ is independent because $I'$ is consistent. Before 
showing that $Compl(I')$ is a maximal independent set we make two observations 
following from the construction of the hull: 

1° For every $\neg R(t)$ in $Hull(I, F)$, the fact $R(t)$ is also present in $Hull(I, F)$. 
2° For every fact $R(t)$, if $\neg R(t)$ is not present in $Hull(I, F)$, then $R(t) \in I$. 

Now, take any fact $p \in Hull(I, F) \setminus Compl(I')$. If $p = \neg R(t)$, then 
$R(t) \in Compl(I')$ and adding $\neg R(t)$ to $Compl(I')$ introduces the stabilizing edge 
$\{R(t), \neg R(t)\}$. Analogously, we show that $Compl(I')$ cannot be extended with 
a fact $R(t)$ whose negation is present in $Hull(I, F)$. The only remaining case 
is extending $Compl(I')$ with a fact $R(t)$ that belongs to $I$. We observe that 
for $R(t) \in I \setminus I'$, we have $I' \cup \{R(t)\} <_I I'$ and so $I' \cup \{R(t)\}$ is inconsistent. 
Consequently, $Compl(I') \cup \{R(t)\}$ contains a conflicting hyperedge. □ 
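
The construction of Compl(I') and of the positive projection can be sketched in a few lines of Python. This is only an illustration: the encoding of facts as tuples, the tag "not" for negated facts, and the set `hull_negations` (standing in for the negated facts of the hull, which is not fully listed in this section) are assumptions made here.

```python
# Facts are tuples such as ("R", 2, 3); a negated fact is ("not", fact).
# This encoding is an assumption of the sketch, not the paper's notation.

def compl(repair, hull_negations):
    """Compl(I'): I' plus every negated hull fact whose positive part is absent from I'."""
    repair = set(repair)
    return repair | {neg for neg in hull_negations if neg[1] not in repair}

def positive_projection(m):
    """M+: the positive facts of a set M of positive and negated facts."""
    return {f for f in m if f[0] != "not"}

# The repair I_2 discussed above; the negations listed here are assumed for illustration.
I2 = {("R", 2, 3), ("P", 1)}
hull_negations = {("not", ("P", 1)), ("not", ("P", 2)), ("not", ("P", 3))}
M = compl(I2, hull_negations)
```

Here `M` contains R(2,3), P(1), ¬P(2), and ¬P(3), and `positive_projection(M)` gives back `I2`, mirroring the identity Compl(I')+ = I' used in the proof of Proposition 4.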

We note that the converse of Proposition 3 is not necessarily true, i.e. for a 
maximal independent set $M$ of the extended conflict hypergraph, its positive 
projection $M^+ = \{R(t) \mid R(t) \in M\}$ need not be a repair. For instance, 
if we take the maximal independent set $M = \{R(1,2), \neg P(2), R(2,3), \neg P(3)\}$ 
of $G_1$ (Fig. 4), its positive projection $M^+ = \{R(1,2), R(2,3)\}$ is not a repair. 
Nevertheless, the extended conflict hypergraph allows us to capture all the 
repairs. And since its size is polynomial in the size of $I$, we consider it to be 
a compact representation of the repairs of $I$. 

Proposition 4 For any maximal independent set $M$ of $G(I, F)$ either $M^+$ is 
a repair of $I$ w.r.t. $F$ or there exists a maximal independent set $N$ of $G(I, F)$ 
such that $N^+ <_I M^+$. 

Proof: Take any maximal independent set $M$ of $G(I, F)$ such that $M^+$ is not 
a repair. Naturally, $M^+$ is consistent and therefore there exists a repair $I'$ such 
that $I' <_I M^+$. It suffices to note that $Compl(I')^+ = I'$ and by Proposition 3 
$Compl(I')$ is a maximal independent set of $G(I, F)$. □ 

3.2 Grounding constraints 

Often, we will find it more convenient to view conflict hyperedges as grounded 
integrity constraints. This helps to pinpoint the exact reasons for integrity 
violations and the facts that can be inserted and deleted to resolve the conflict. 

Definition 9 For any conflict hyperedge 

$$\{R_1(t_1), \ldots, R_n(t_n), \neg P_1(s_1), \ldots, \neg P_m(s_m)\}$$

in $G(I, F)$ the implication 

$$R_1(t_1) \land \ldots \land R_n(t_n) \to P_1(s_1) \lor \ldots \lor P_m(s_m)$$

is a ground rule (or simply rule) in $G(I, F)$. By $Rules(I, F)$ we denote the 
set of all ground rules in $G(I, F)$. 

A denial (full TGD, JD, or non-JD, resp.) rule is a rule obtained from a conflict 
w.r.t. a denial constraint (full TGD, JD, or non-JD, resp.). The facts in the lhs 
and the rhs of a ground rule are represented as sets, i.e. no particular order is 
assumed and duplicates are removed. 

Naturally, the cardinality of the set of the ground rules is equal to the number 
of the conflict hyperedges, and thus it is polynomial in the size of $I$. We 
also note that when considering the instances using facts from the hull only, 
satisfaction of the set of constraints $F$ implies the satisfaction of $Rules(I, F)$. 
The converse is also true because the hull contains all relevant facts. 

Proposition 5 For any instance $J$ such that $J \subseteq Hull(I, F)$, $J \models F$ if and 
only if $J \models Rules(I, F)$. 



4 Repairing in the presence of full tuple-generating dependencies 

Now, we show that repair checking in the presence of full tuple-generating 
dependencies and denial constraints is tractable. We use the result to construct 
a complete and sound repairing algorithm. In the scope of this section we 
fix an instance $I$ and a set $F$ of denial constraints and full tuple-generating 
dependencies. 

4.1 Repair checking 



We begin by presenting an alternative characterization of repairs w.r.t. a set 
of full TGDs and denial constraints. If we view full TGDs as Datalog programs 
we can use the standard consequence operator to identify the facts that need 
to be added to satisfy the constraints. 

Definition 10 For a set of facts $J \subseteq Hull(I, F)$, the operator of immediate 
consequence of $F$ on $J$ is defined as 

$$T_F(J) = J \cup \{P(s) \mid R_1(t_1) \land \ldots \land R_n(t_n) \to P(s) \in Rules(I, F) \text{ s.t. } \{R_1(t_1), \ldots, R_n(t_n)\} \subseteq J\}.$$

$T_F^*$ is defined as the transitive closure of $T_F$. 

It is a classical result that $T_F^*(J)$ can be computed in time polynomial in the 
size of $J$ [6]. Now, we present an alternative characterization of a repair w.r.t. 
a set of full TGDs and denial constraints. Essentially, every repair is obtained 
by closing under $T_F$ some subset of the original instance and verifying that 
the resulting instance is consistent. 
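
The immediate consequence operator and its closure can be sketched as a naive fixpoint iteration. The representation is an assumption of this sketch: a ground rule is a pair `(lhs, rhs)` with `lhs` a frozenset of facts and `rhs` either a single fact (full TGD rule) or `None` (denial rule, i.e. rhs false); the two sample rules are hand-written for illustration.

```python
def immediate_consequence(facts, rules):
    """One application of T_F: add the rhs of every full TGD rule whose lhs is contained in facts."""
    facts = set(facts)
    return facts | {rhs for lhs, rhs in rules if rhs is not None and lhs <= facts}

def closure(facts, rules):
    """T_F^*: iterate T_F until no new fact is derived."""
    current = set(facts)
    while True:
        nxt = immediate_consequence(current, rules)
        if nxt == current:
            return current
        current = nxt

# Hypothetical ground rules R(1,1,1) -> P(1,1) and P(1,1) -> Q(1).
R111, P11, Q1 = ("R", 1, 1, 1), ("P", 1, 1), ("Q", 1)
rules = [(frozenset({R111}), P11), (frozenset({P11}), Q1)]
one_step = immediate_consequence({R111}, rules)
fixpoint = closure({R111}, rules)
```

A single step derives only P(1,1); the closure also contains Q(1). Each iteration adds at least one fact of the hull or stops, so the number of iterations is bounded by the number of ground rules.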

Lemma 1 $I'$ is a repair of $I$ w.r.t. $F$ if and only if the following conditions 
are satisfied: 

(i) $I'$ is consistent, 

(ii) $T_F^*(I' \cap I) = I'$, 

(iii) there is no $R(t) \in I \setminus I'$ such that $J' = T_F^*(I' \cup \{R(t)\})$ is consistent and 
$J' \setminus I \subseteq I' \setminus I$. 

Proof: For the only if part, $I' \in Repairs(I, F)$ implies that (i) holds. 

To show (ii) we note that $T_F^*(I \cap I')$ by definition is the minimal set that 
contains $I \cap I'$ and satisfies all full TGDs from $F$. Hence, $T_F^*(I \cap I') \subseteq I'$ 
and, as a subset of a consistent instance $I'$, $T_F^*(I \cap I')$ satisfies also all denial 
constraints from $F$. We also note that $T_F^*(I \cap I')$ and $I'$ agree on the facts in 
$I$. $I'$ is the $\leq_I$-minimal consistent instance that contains all of $I \cap I'$ and none 
of $I \setminus I'$, which implies that $I' = T_F^*(I \cap I')$. 

To show (iii) we take any $R(t) \in I \setminus I'$ such that $J' = T_F^*(I' \cup \{R(t)\})$ is 
consistent. Since $I'$ is a $\leq_I$-minimal consistent instance, $J' \not<_I I'$. This implies 
$J' \setminus I \not\subseteq I' \setminus I$ because $I \setminus J' \subseteq I \setminus I'$. 

For the if part take any $\leq_I$-minimal consistent instance $I''$ such that $I'' \leq_I I'$. 
Such an instance exists because by (i) $I'$ is consistent. $I''$ is a repair, and it satisfies 
(ii). Therefore it suffices to show that $I' \cap I = I'' \cap I$. $I'' \leq_I I'$ shows directly 
that $I' \cap I \subseteq I'' \cap I$. This also shows that $I' \subseteq I''$. 






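To show $I'' \cap I \subseteq I' \cap I$, suppose there exists $R(t) \in I'' \cap I$ such that $R(t) \notin I' \cap I$. 
Naturally, $T_F^*(I' \cup \{R(t)\}) \subseteq I''$ and since $I' \subseteq I''$, we have 

$$T_F^*(I' \cup \{R(t)\}) \setminus I \subseteq I'' \setminus I \subseteq I' \setminus I.$$

On the other hand, we note that $R(t) \in I \setminus I'$ and $T_F^*(I' \cup \{R(t)\})$ is consistent 
as a subset of a consistent instance closed under $T_F$. By (iii), we obtain 

$$T_F^*(I' \cup \{R(t)\}) \setminus I \not\subseteq I' \setminus I,$$

which is a contradiction. Thus, $I' = I''$ and $I'$ is a repair. □ 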

We observe that the conditions of Lemma 1 can be checked in time polynomial 
in the size of the instances $I$ and $I'$. Consequently, 

Theorem 1 Repair checking is in PTIME for any set of denial constraints 
and full tuple-generating dependencies. 
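
The three conditions of Lemma 1 translate almost literally into a repair check. The sketch below assumes that the ground rules are already available as pairs `(lhs, rhs)`, with `rhs = None` standing for false; the instance and rules are those of a schema with the single constraint R(x) → P(x), written out by hand.

```python
def closure(facts, rules):
    """T_F^*: close a set of facts under the full TGD ground rules."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if rhs is not None and lhs <= facts and rhs not in facts:
                facts.add(rhs)
                changed = True
    return facts

def consistent(facts, rules):
    """A set of facts satisfies the ground rules if no rule has its lhs present and its rhs missing."""
    return all(not lhs <= facts or (rhs is not None and rhs in facts) for lhs, rhs in rules)

def is_repair(candidate, instance, rules):
    I, Ip = set(instance), set(candidate)
    if not consistent(Ip, rules):                      # condition (i)
        return False
    if closure(Ip & I, rules) != Ip:                   # condition (ii)
        return False
    for fact in I - Ip:                                # condition (iii)
        J = closure(Ip | {fact}, rules)
        if consistent(J, rules) and (J - I) <= (Ip - I):
            return False
    return True

R1, R2, P1, P2 = ("R", 1), ("R", 2), ("P", 1), ("P", 2)
I = {R1, R2}
rules = [(frozenset({R1}), P1), (frozenset({R2}), P2)]
```

With these inputs the check accepts the empty instance, {R(1), P(1)}, {R(2), P(2)}, and {R(1), P(1), R(2), P(2)}, and rejects for example {R(1)} (inconsistent) and {P(1)} (violates condition (ii)).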

4.2 Constructing a repair 

In this subsection we present a polynomial-time algorithm for constructing 
repairs of an instance w.r.t. a set of full TGDs and denial constraints. Rather 
than trying to resolve all the conflicts present in the instance, the algorithm 
constructs a repair from scratch: it begins with an empty instance, iterates 
over the facts of the original instance, and for every fact makes a decision 
whether to discard the fact or to add it to the constructed instance. It should 
be noted that those two actions, although related, are different and should 
not be confused with inserting and deleting facts from the original instance in 
order to resolve conflicts. 

Before we present the algorithm for full TGDs and denial constraints, we re- 
call a simpler algorithm [36] that constructs repairs in the presence of denial 
constraints. Algorithm 1 constructs a maximal consistent subset of the input 

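Algorithm 1 Constructing a repair of $I$ w.r.t. a set of denial constraints $F$ 
1: $I^\circ \leftarrow I$ 
2: $J \leftarrow \emptyset$ 
3: while $I^\circ \neq \emptyset$ do 
4:     choose $R(t) \in I^\circ$ 
5:     $I^\circ \leftarrow I^\circ \setminus \{R(t)\}$ 
6:     if $J \cup \{R(t)\} \models F$ then 
7:         $J \leftarrow J \cup \{R(t)\}$ 
8: return $J$ 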



instance by iterating over the facts of the input instance (using $I^\circ$ to store the 
remaining facts) and adding the current fact if doing so does not violate the 
constraints. Since maximal consistent subsets of the original instance correspond 
to maximal independent sets of the conflict hypergraph, by Proposition 2 
this algorithm always returns a repair, i.e. it is sound. We also note that it 
is complete, i.e. every repair of the original instance can be constructed: it 
suffices to choose first the facts of the desired repair. 
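
A minimal Python sketch of Algorithm 1 follows. It assumes that the denial constraints have already been grounded into conflict hyperedges (sets of facts that cannot all be present), and it models the nondeterministic choice by the order in which the facts are supplied; the relation `Emp` and its key conflict are hypothetical.

```python
def violates(facts, conflicts):
    """True if some conflict hyperedge is fully contained in facts."""
    return any(edge <= facts for edge in conflicts)

def greedy_repair(ordered_facts, conflicts):
    """Algorithm 1: add each fact in the given order unless it creates a conflict."""
    J = set()
    for fact in ordered_facts:
        if not violates(J | {fact}, conflicts):
            J.add(fact)
    return J

# Hypothetical instance: two conflicting salaries for "ann" under a key constraint.
a10, a20, b30 = ("Emp", "ann", 10), ("Emp", "ann", 20), ("Emp", "bob", 30)
conflicts = [frozenset({a10, a20})]
first = greedy_repair([a10, a20, b30], conflicts)
second = greedy_repair([a20, a10, b30], conflicts)
```

The two orders produce the two repairs of this instance, which illustrates the completeness argument: listing the facts of the desired repair first yields exactly that repair.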

An approach that constructs a maximal consistent subset of the original 
instance is also sound for constructing a repair in the presence of denial 
constraints and general TGDs [29]. However, in the presence of TGDs a repair 
need not be a subset of the original instance and consequently this approach 
is not necessarily complete. 

Example 7 For the schema $S = \{R(A), P(A)\}$ with a set of constraints $F = 
\{R(x) \to P(x)\}$ consider the instance $I = \{R(1), R(2)\}$. This instance has 
four repairs w.r.t. $F$: 

$I'_1 = \emptyset$, $I'_2 = \{R(1), P(1)\}$, $I'_3 = \{R(2), P(2)\}$, $I'_4 = \{R(1), P(1), R(2), P(2)\}$. 

Only the repair $I'_1$ is a subset of the original instance. 

Our approach extends Algorithm 1 by allowing it also to add a fact together 
with the facts implied by full TGDs. In this way, for instance, the repair $I'_2$ 
is obtained by adding $R(1)$ together with $P(1)$ and discarding $R(2)$. Hence, 
from now on when we add a fact, we implicitly add the facts that are required 
to satisfy full TGDs (i.e., we keep the instance closed under $T_F$). 

In the general scenario, because of the complex interaction among facts, the 
decision whether to add or to discard a fact becomes quite intricate. We illus- 
trate this in the following example. 

Example 8 We take the schema $S = \{R(A, B, C), P(A, B)\}$ with a set of 
constraints $F = \{R(x, y, z) \to P(x, y),\ P: A \to B\}$. 

First, we consider the instance $I_1 = \{P(1,1), R(1,2,1)\}$. We start with an 
empty instance and begin with the fact $P(1,1)$. We add it as doing so does not 
violate the constraints. Adding the next fact $R(1,2,1)$ would require adding 
also the fact $P(1,2)$. This would, however, create a conflict. Hence, we must 
discard $R(1,2,1)$. The obtained instance $\{P(1,1)\}$ is a repair of $I_1$. 

Now, let's consider the instance $I_2 = \{R(1,2,1), R(1,2,2)\}$ and begin with 
the fact $R(1,2,1)$. Adding the fact $R(1,2,1)$ would require adding also the 
fact $P(1,2)$ which is not present in the instance constructed so far. Therefore, 
we can consider both adding and discarding $R(1,2,1)$. We decide to discard 
it, but we note that the set of facts $\{P(1,2)\}$ cannot become included later 
on in the constructed instance. We store it on the list of banned sets. Intuitively, 
a banned set contains tuples whose mutual absence in the constructed 
instance justifies discarding some other tuple. Consequently, we must prevent 
adding any tuples which may cause a banned set to be included in the constructed 
instance. For instance, adding the next fact $R(1,2,2)$ would require 
adding also $P(1,2)$, and cause inclusion of the banned set $\{P(1,2)\}$. Hence, 
we discard $R(1,2,2)$, and finally, obtain the empty repair $\emptyset$. We observe that 
if the fact $R(1,2,2)$ were added (together with $P(1,2)$), the obtained instance 
$\{R(1,2,2), P(1,2)\}$, although consistent, would not be $\leq_{I_2}$-minimal (the repair 
$\{R(1,2,1), R(1,2,2), P(1,2)\}$ is relatively closer to $I_2$). 

Finally, we consider the instance $I_3 = \{R(1,1,1), P(1,1), P(1,2)\}$ and begin 
with the fact $R(1,1,1)$. Discarding it would require memorizing the banned set 
$\{P(1,1)\}$. However, this set is not appropriate for our purposes because the 
fact $P(1,1)$ is present in the original instance and later we might be forced to 
add it to the constructed instance. Hence, we add the fact $R(1,1,1)$ (together 
with $P(1,1)$). Next, we add the fact $P(1,1)$ but ignore the fact $P(1,2)$. The 
constructed instance $\{R(1,1,1), P(1,1)\}$ is a repair. 

Now, we present a sound and complete Algorithm 2 constructing a repair of 
a (possibly inconsistent) instance $I$ w.r.t. a set $F$ of full TGDs and denial 
constraints. It starts with an empty instance $J$ and iterates over the facts of 

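Algorithm 2 Constructing a repair of $I$ w.r.t. a set of full TGDs $F$ 
1: $I^\circ \leftarrow I$ 
2: $J \leftarrow \emptyset$ 
3: $Banned \leftarrow \emptyset$ 
4: while $I^\circ \neq \emptyset$ do 
5:     choose $R(t) \in I^\circ$ and $b \in \{true, false\}$ 
6:     $I^\circ \leftarrow I^\circ \setminus \{R(t)\}$ 
7:     $J' \leftarrow T_F^*(J \cup \{R(t)\})$ 
8:     if $J' \not\models F$ or ($b$ and $J' \setminus (I \cup J) \neq \emptyset$) or $\exists B \in Banned.\ B \subseteq J'$ then 
9:         if $J' \models F$ then $Banned \leftarrow Banned \cup \{J' \setminus (I \cup J)\}$ 
10:    else $J \leftarrow J'$ 
11: return $J$ 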



the original instance $I$. Banned is a collection of banned sets of facts, i.e. sets 
that are not to be included in the constructed instance $J$. We note that some 
elements of those sets can, however, be included in $J$. 

For every fact $R(t)$ the algorithm makes a choice $b$ whether or not it should 
try discarding the fact. Here, this choice is nondeterministic but, in practice, 
it could be based on the user preference. The fact $R(t)$ is discarded if one of 
the following conditions is satisfied: 

(*) Adding $R(t)$ violates the constraints. 
(**) Adding $R(t)$ does not violate the constraints, but $b$ is set to true and adding 
$R(t)$ implies adding facts that are not present in $J$ and $I$. 
(***) Adding $R(t)$ leads to inclusion of some previously created banned set. 

If the fact is discarded even though adding it does not violate the constraints, 
a banned set is added to Banned (line 9). Finally, if none of the conditions 
above is satisfied, the fact is added to $J$ (line 10). 
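
The loop of Algorithm 2 with the conditions (*), (**), and (***) can be sketched as follows. The sketch is written from the prose description above and is not the authors' code: the nondeterministic choices are passed in as a list of pairs `(fact, b)`, ground rules are pairs `(lhs, rhs)` with `rhs = None` for false, and the rule list for the constraints of Example 8 is written out by hand.

```python
def closure(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if rhs is not None and lhs <= facts and rhs not in facts:
                facts.add(rhs)
                changed = True
    return facts

def consistent(facts, rules):
    return all(not lhs <= facts or (rhs is not None and rhs in facts) for lhs, rhs in rules)

def construct_repair(choices, instance, rules):
    I, J, banned = set(instance), set(), []
    for fact, b in choices:
        Jp = closure(J | {fact}, rules)
        new = Jp - (I | J)                      # facts outside I that adding `fact` would bring in
        ok = consistent(Jp, rules)
        if (not ok                              # (*)
                or (b and new)                  # (**)
                or any(B <= Jp for B in banned)):   # (***)
            if ok:
                banned.append(frozenset(new))   # remember why the fact was discarded
        else:
            J = Jp
    return J

R111, R121, R122 = ("R", 1, 1, 1), ("R", 1, 2, 1), ("R", 1, 2, 2)
P11, P12 = ("P", 1, 1), ("P", 1, 2)
rules = [(frozenset({R111}), P11), (frozenset({R121}), P12),
         (frozenset({R122}), P12), (frozenset({P11, P12}), None)]

r1 = construct_repair([(P11, False), (R121, False)], {P11, R121}, rules)
r2 = construct_repair([(R121, True), (R122, False)], {R121, R122}, rules)
r3 = construct_repair([(R111, True), (P11, True), (P12, True)], {R111, P11, P12}, rules)
```

The three runs reproduce the three cases of Example 8: `r1` is {P(1,1)}, `r2` is the empty repair (the second fact is rejected because of the banned set {P(1,2)}), and `r3` is {R(1,1,1), P(1,1)}.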

Before proving that Algorithm 2 is sound and complete, we make several 
observations. First, we note that $J$ is always closed under $T_F$. Moreover, $J$ is 
always consistent because the condition (*) ensures that facts are added to 
$J$ only if doing so does not violate the constraints. Finally, the main loop of 
Algorithm 2 satisfies the following invariant: 

$$Inv = \forall B \in Banned.\ B \not\subseteq J.$$

Indeed, the invariant is trivially satisfied before the execution enters the main 
loop. Also, the condition (***) ensures that facts are added to the constructed 
instance $J$ only if doing so does not violate $Inv$. Hence, we need only to check 
that creating a new banned set does not violate the invariant. We observe that 
a new banned set is created only if (**) or (***) is satisfied. (**) implies 
directly that the new banned set satisfies $Inv$. (***) implies this implicitly. In 
this case there exists a banned set $B$ such that $B \subseteq J' = T_F^*(J \cup \{R(t)\})$. We 
note that all banned sets contain no fact from $I$. $B \not\subseteq J$ by $Inv$. Thus, the 
newly created banned set $B' = J' \setminus (I \cup J)$ is not included in $J$. 

Theorem 2 Algorithm 2 is a sound and complete repairing algorithm for any 
instance and any set of denial constraints and full tuple-generating dependencies. 
Algorithm 2 works in time polynomial in the size of the input instance $I$. 

Proof: Soundness. We show that for any execution of Algorithm 2 the returned 
instance, denoted here by $I'$, satisfies the conditions (i), (ii), and (iii) 
of Lemma 1. 

The conditions (i) and (ii) are satisfied trivially because, as observed before, 
at every time the constructed instance $J$ is consistent and closed under $T_F$. 

To show (iii) we take any $R(t) \in I \setminus I'$ such that $J' = T_F^*(I' \cup \{R(t)\}) \models F$. 
Consider the iteration of the main loop during which $R(t)$ was chosen and note 
that $J \subseteq I'$. We observe that the condition (*) is not satisfied, but since $R(t)$ 
is not added to $I'$, (**) or (***) is. Consequently, a banned set $B = J' \setminus (I \cup J)$ 
is added to Banned. It is easy to see that $B \subseteq J' \setminus I$. On the other hand, $Inv$ 
implies that $B \not\subseteq I' \setminus I$, which proves that $J' \setminus I \not\subseteq I' \setminus I$. 

Completeness. Take any repair $I' \in Repairs(I, F)$ and consider an execution 
of Algorithm 2 during which: 1) in the first phase, it selects all facts from $I \cap I'$ 
and sets $b$ to false; 2) in the second phase, it chooses all facts from $I \setminus I'$ and 
sets $b$ to true. We show that this execution returns $I'$. 

First, we consider the facts chosen in the first phase and we show that none 
of the conditions (*), (**), and (***) is satisfied. Indeed, the condition (*) is 
not satisfied because $I'$ is consistent (and so is any of its subsets closed under 
$T_F$). Trivially, the condition (**) is also not satisfied because $b$ is chosen to 
be false. The condition (***) is not satisfied because during the first phase 
Banned remains empty. Note that the instance $J$ obtained after the first phase 
consists exactly of the facts of $I' \cap I$ closed under $T_F$. By (ii) of Lemma 1, 
$J = I'$. 

Now, we show with a simple inductive argument that none of the facts from 
$I \setminus I'$ are added in the second phase. By (iii) of Lemma 1, for a fact $R(t) \in I \setminus I'$, 
if (*) is not satisfied, then $T_F^*(J \cup \{R(t)\}) \setminus I \not\subseteq J \setminus I$. This implies that 
$T_F^*(J \cup \{R(t)\}) \setminus (I \cup J)$ is nonempty, i.e. (**) is satisfied ($b$ is true). 

We finish the proof by showing that Algorithm 2 works in time polynomial in 
the size of $I$. First, we observe the algorithm iterates over the facts of $I$. For 
every fact it creates at most one banned set, and so the cardinality of Banned 
is bounded by the size of $I$. Each banned set is a subset of $Hull(I, F)$, and thus 
of polynomial size as well. Consequently, for every fact all of the conditions 
(*), (**), and (***) can be checked in polynomial time. □ 



5 Consistent query answering for full TGDs 

In this section we investigate consistent query answering in the presence of full 
tuple-generating dependencies and denial constraints. We begin by presenting 
a polynomial algorithm for computing consistent answers to quantifier-free 
queries in the presence of acyclic full TGDs and denial constraints. Next, we 
extend this approach to handle join dependencies as well. Finally, we show 
that for arbitrary full TGDs consistent query answering is coNP-complete. 

5.1 Warm-up: acyclic full TGDs and denial constraints 

In this section we extend the algorithm computing consistent query answers 
to closed quantifier-free queries in the presence of denial constraints [15]. The 
main idea of the algorithm is to check if there exists a repair that does not 
satisfy the query, i.e. satisfies the negated query. The negated query specifies 
the facts that need to be present and the facts that need to be absent in the 
repair. To find if the repair in question can be constructed we devise supports 
and blocks of the facts that need to be respectively present and absent in the 
repair. Intuitively, a support of a fact is a set of facts from the original instance 
that, if contained in the repair, guarantees the presence of this fact. Conversely, 
a block of a fact specifies facts that lead to a conflict with this fact and so 
their presence guarantees the absence of the fact in the repair. Additionally, a 
block can specify facts that must not be included during the repairing process. 
Finally, we show how to check that a combination of supports and blocks can 
be realized in the same repair. 

We fix an instance $I$ and a set $F$ of denial constraints and acyclic full 
tuple-generating dependencies. For brevity, we use inference-like rules to define the 
supports and blocks of a fact. A rule of the form $\frac{A}{B}$ reads: "B provided A". 
Also, in $A$ we often use ground rules which implicitly belong to $Rules(I, F)$. 

Definition 11 (Support) A support of a fact $R(t) \in Hull(I, F)$ is a subset 
of $I$ defined with the following rules: 

$$S_0:\ \frac{R(t) \in I}{\{R(t)\} \in Supp(R(t))}$$

$$S_1:\ \frac{R_1(t_1) \land \ldots \land R_n(t_n) \to R(t) \qquad R(t) \notin I \qquad S_i \in Supp(R_i(t_i))\ \ \forall i \in \{1, \ldots, n\}}{\bigcup_i S_i \in Supp(R(t))}$$

where $Supp(R(t))$ is the set of all supports of $R(t)$. 

Essentially, a support of a tuple from the original instance is a singleton 
consisting of that tuple (rule $S_0$). If a tuple does not belong to the original instance 
but there is a full TGD rule having the tuple in its rhs, then a support of that 
tuple is a union of the supports of the tuples in the lhs (rule $S_1$). 

Example 9 We take the schema $S = \{R(A, B, C), P(A, B), Q(A)\}$ with the 
set of constraints $F = \{R(x, y, z) \to P(x, y),\ P(x, y) \to Q(x),\ P: A \to B\}$, 
and consider the instance $I = \{R(1,1,1), R(1,2,1), P(1,2), Q(2)\}$. The hull is 
$Hull(I, F) = I \cup \{P(1,1), Q(1)\}$. The set of ground rules is 

$$Rules(I, F) = \{R(1,1,1) \to P(1,1),\ R(1,2,1) \to P(1,2),\ P(1,1) \to Q(1),\ P(1,2) \to Q(1),\ P(1,1) \land P(1,2) \to false\}.$$

$Repairs(I, F)$ consists of the following instances: 

$I'_1 = \{Q(2)\}$, $I'_2 = \{R(1,1,1), P(1,1), Q(1), Q(2)\}$, 
$I'_3 = \{R(1,2,1), P(1,2), Q(1), Q(2)\}$. 

The facts from $I$ have simple supports obtained with the rule $S_0$: 

$Supp(R(1,1,1)) = \{\{R(1,1,1)\}\}$, $Supp(P(1,2)) = \{\{P(1,2)\}\}$, 
$Supp(R(1,2,1)) = \{\{R(1,2,1)\}\}$, $Supp(Q(2)) = \{\{Q(2)\}\}$. 

The fact $P(1,1)$ has only one support obtained with the rule $S_1$: 

$Supp(P(1,1)) = \{\{R(1,1,1)\}\}$. 

Finally, $Q(1)$ has two supports: 

$Supp(Q(1)) = \{\{R(1,1,1)\}, \{P(1,2)\}\}$. 
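
The two support rules can be read as a recursive procedure, sketched below on the data of Example 9. The encoding is an assumption of the sketch: facts are tuples, ground rules are pairs `(lhs, rhs)` with `rhs = None` for false, and the recursion terminates only because the full TGDs are acyclic.

```python
from itertools import product

def supports(fact, instance, rules):
    """All supports of `fact`: rule S0 for facts of the instance, rule S1 otherwise."""
    if fact in instance:
        return {frozenset({fact})}                          # S0
    result = set()
    for lhs, rhs in rules:
        if rhs != fact:
            continue
        per_atom = [supports(a, instance, rules) for a in lhs]
        for combo in product(*per_atom):                    # one support per lhs fact
            result.add(frozenset().union(*combo))           # S1
    return result

R111, R121 = ("R", 1, 1, 1), ("R", 1, 2, 1)
P11, P12, Q1, Q2 = ("P", 1, 1), ("P", 1, 2), ("Q", 1), ("Q", 2)
I = {R111, R121, P12, Q2}
rules = [(frozenset({R111}), P11), (frozenset({R121}), P12),
         (frozenset({P11}), Q1), (frozenset({P12}), Q1),
         (frozenset({P11, P12}), None)]
supp_Q1 = supports(Q1, I, rules)
```

For Q(1) the sketch returns the two supports {R(1,1,1)} and {P(1,2)} listed in the example. Note that P(1,2) belongs to I, so only its singleton support is used and the rule R(1,2,1) → P(1,2) contributes nothing.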



The supports of a fact define conditions that ensure that it is present in the 
repair. 

Proposition 6 For every $I' \in Repairs(I, F)$ and every $R(t) \in Hull(I, F)$ 

$$R(t) \in I' \iff \exists S \in Supp(R(t)).\ S \subseteq I'.$$

The proof is by a simple induction over the position of the relation name $R$ 
in a topological sort of the dependency graph $\mathcal{D}(S, F)$. The proof of a more 
general claim (Proposition 10) can be found in Appendix A. 

Blocking a fact is more complex. First, the facts that are not present in the 
original instance $I$ need to be added in the process of creating a repair. In this 
case we can explicitly forbid adding this fact (rule $B_0$). For instance, the fact 
$Q(1)$ can be blocked this way. Facts that belong to $I$ can be blocked using 
conflicts they are involved in. If a fact is involved in a denial conflict then it 
is blocked by the presence of other facts that together lead to a conflict (rule 
$B_1$). Finally, blocks can be propagated using full TGDs (rule $B_2$). 

Consequently, a block of a fact consists of two sets: one indicating the facts 
from $I$ that need to be present in the repair and the other one indicating a 
fact that must not be added to the repair. 

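Definition 12 (Block) A block of a fact $R(t) \in Hull(I, F)$ is a pair that 
consists of a subset of $I$ and a set of at most one fact from $Hull(I, F) \setminus I$, 
defined with the following rules: 

$$B_0:\ \frac{R(t) \notin I}{(\emptyset, \{R(t)\}) \in Block(R(t))}$$

$$B_1:\ \frac{R(t) \land R_1(t_1) \land \ldots \land R_n(t_n) \to false \qquad R(t) \in I \qquad S_i \in Supp(R_i(t_i))\ \ \forall i \in \{1, \ldots, n\}}{(\bigcup_i S_i, \emptyset) \in Block(R(t))}$$

$$B_2:\ \frac{R(t) \land R_1(t_1) \land \ldots \land R_n(t_n) \to P(s) \qquad S_i \in Supp(R_i(t_i))\ \ \forall i \in \{1, \ldots, n\} \qquad R(t) \in I \qquad (B, N) \in Block(P(s))}{(\bigcup_i S_i \cup B, N) \in Block(R(t))}$$

where $Block(R(t))$ is the set of all blocks of $R(t)$. 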

Blocks specify the conditions that ensure a fact to be absent in a repair. 

Proposition 7 For every $I' \in Repairs(I, F)$ and every $R(t) \in Hull(I, F)$ 

$$R(t) \notin I' \iff \exists (B, N) \in Block(R(t)).\ B \subseteq I' \land N \cap I' = \emptyset.$$

The proof is by induction over the position of $R$ in a reverse topological sorting 
of $\mathcal{D}(S, F)$. The proof of a more general claim (Proposition 11) can be found 
in Appendix A. We remark, however, that acyclicity of $F$ is essential here. 

Example 10 (cont. Example 9) The facts $Q(1)$ and $P(1,1)$ have simple 
blocks obtained with the rule $B_0$: 

$Block(Q(1)) = \{(\emptyset, \{Q(1)\})\}$, $Block(P(1,1)) = \{(\emptyset, \{P(1,1)\})\}$. 

The fact $R(1,1,1)$ has one block obtained with the rule $B_2$: 

$Block(R(1,1,1)) = \{(\emptyset, \{P(1,1)\})\}$. 

The fact $P(1,2)$ has two blocks obtained with the rules $B_1$ and $B_2$: 

$Block(P(1,2)) = \{(\{R(1,1,1)\}, \emptyset), (\emptyset, \{Q(1)\})\}$. 

The fact $R(1,2,1)$ has two blocks obtained with the rule $B_2$: 

$Block(R(1,2,1)) = \{(\{R(1,1,1)\}, \emptyset), (\emptyset, \{Q(1)\})\}$. 

Finally, the fact $Q(2)$ has no blocks (it does not participate in any conflict): 

$Block(Q(2)) = \emptyset$. 

The following proposition ensures tractability of our approach. 

Proposition 8 For any fact the number of all of its supports and the number 
of all of its blocks can be computed in time polynomial in the size of the instance 
$I$. 

This claim is proved with a simple combinatorial argument. Also here, the 
acyclicity of $F$ is essential. The proof of a more general claim (Proposition 12) 
can be found in Appendix A. 
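
The block rules $B_0$, $B_1$, and $B_2$ of Definition 12 admit a similar recursive reading, sketched here on the data of Examples 9 and 10. As before, the tuple encoding of facts and the pairs `(lhs, rhs)` with `rhs = None` for false are assumptions of the sketch, and termination relies on acyclicity.

```python
from itertools import product

def supports(fact, instance, rules):
    if fact in instance:
        return {frozenset({fact})}
    out = set()
    for lhs, rhs in rules:
        if rhs == fact:
            for combo in product(*(supports(a, instance, rules) for a in lhs)):
                out.add(frozenset().union(*combo))
    return out

def blocks(fact, instance, rules):
    """All blocks (B, N) of `fact`, following rules B0, B1, and B2."""
    if fact not in instance:
        return {(frozenset(), frozenset({fact}))}            # B0
    out = set()
    for lhs, rhs in rules:
        if fact not in lhs:
            continue
        others = [supports(a, instance, rules) for a in lhs if a != fact]
        # B1: a denial rule contributes the pair (empty, empty);
        # B2: a full TGD rule propagates every block of its rhs.
        tails = [(frozenset(), frozenset())] if rhs is None else blocks(rhs, instance, rules)
        for combo in product(*others):
            base = frozenset().union(*combo)
            for b, n in tails:
                out.add((base | b, n))
    return out

R111, R121 = ("R", 1, 1, 1), ("R", 1, 2, 1)
P11, P12, Q1, Q2 = ("P", 1, 1), ("P", 1, 2), ("Q", 1), ("Q", 2)
I = {R111, R121, P12, Q2}
rules = [(frozenset({R111}), P11), (frozenset({R121}), P12),
         (frozenset({P11}), Q1), (frozenset({P12}), Q1),
         (frozenset({P11, P12}), None)]
block_P12 = blocks(P12, I, rules)
```

The result for P(1,2) consists of ({R(1,1,1)}, ∅), obtained from the functional dependency conflict, and (∅, {Q(1)}), propagated from the block of Q(1); Q(2) occurs in no rule and therefore gets no block.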

Finally, we show how to check if there exists a repair that realizes a given 
combination of supports and blocks. 

Lemma 2 For any (possibly cyclic) set of full TGDs and denial constraints 
$F$, an instance $I$, and two sets of facts $P \subseteq I$ and $N \subseteq Hull(I, F) \setminus I$, a repair 
containing all facts from $P$ and no facts from $N$ exists if and only if $T_F^*(P)$ is 
consistent and disjoint with $N$. 

Proof: The only if part of the proof is trivial. For the if part take any repair 
$I'$ such that $I' \leq_I T_F^*(P)$. Such an instance exists because $T_F^*(P)$ is consistent. 
We show that $I'$ is the desired repair. First, $I' \leq_I T_F^*(P)$ and $P \subseteq I$ imply 
that $P \subseteq I'$, and consequently $T_F^*(P) \subseteq I'$ (as $I'$ is consistent). $I' \leq_I T_F^*(P)$ 
implies also $N \cap I' = \emptyset$ because $N$ contains no fact from $I$. □ 

We use the previous results to construct Algorithm 3 computing the consistent 
answer to a quantifier-free query $Q$ in the instance $I$ w.r.t. a set of denial 
constraints and acyclic full TGDs $F$. 

We assume that the query $Q$ is in CNF and we note that true is not the 
consistent answer to $Q$ if and only if there exists a conjunct of $Q$ that is not 
satisfied by some repair. Consequently, for each conjunct $Q_i$ of $Q$ we check 
if there exists a repair that satisfies $\neg Q_i$ (line 2). A negated conjunct is a 
conjunction of positive and negative atomic formulas 

$$\neg Q_i = R_1(t_1) \land \ldots \land R_n(t_n) \land \neg P_1(s_1) \land \ldots \land \neg P_m(s_m)$$

and therefore a repair satisfying $\neg Q_i$ is a repair that contains all $R_i(t_i)$'s 
and no $P_j(s_j)$'s. The existence of such a repair is checked with the function 
ExistsRepair. 

Because all repairs are constructed from facts in $Hull(I, F)$, we can assume 
that all $R_i(t_i)$'s and $P_j(s_j)$'s belong to $Hull(I, F)$. Indeed, if some $R_i(t_i)$ does 
not belong to $Hull(I, F)$, then a repair containing $R_i(t_i)$ does not exist (line 
7). Similarly, if some $P_j(s_j)$ does not belong to $Hull(I, F)$, then no repair 
contains $P_j(s_j)$ (line 9). Using Propositions 6 and 7 we show that there exists 
a repair $I'$ containing all $R_i(t_i)$'s and no $P_j(s_j)$'s if and only if for every $R_i(t_i)$ 

Algorithm 3 Computing the consistent answer to $Q$ in $I$ w.r.t. $F$. 
function CQA($Q$, $I$, $F$) 
    precompute: $Hull(I, F)$, $Supp$, and $Block$ for $I$ and $F$. 
1:  let $Q = Q_1 \land \ldots \land Q_k$    /* Query in CNF */ 
2:  for $i \leftarrow 1, \ldots, k$ do 
3:      let $\neg Q_i = R_1(t_1) \land \ldots \land R_n(t_n) \land \neg P_1(s_1) \land \ldots \land \neg P_m(s_m)$ 
4:      if ExistsRepair($\{R_1(t_1), \ldots, R_n(t_n)\}$, $\{P_1(s_1), \ldots, P_m(s_m)\}$) then 
5:          return false 
6:  return true 
end function 

function ExistsRepair($T_P$, $T_N$) 
7:  if $T_P \not\subseteq Hull(I, F)$ then 
8:      return false 
9:  if $T_N \not\subseteq Hull(I, F)$ then 
10:     $T_N \leftarrow T_N \cap Hull(I, F)$ 
11: let $\{R_1(t_1), \ldots, R_n(t_n)\} = T_P$ 
12: let $\{P_1(s_1), \ldots, P_m(s_m)\} = T_N$ 
13: for $S_1 \in Supp(R_1(t_1)), \ldots, S_n \in Supp(R_n(t_n))$ do 
14:     for $(B_1, N_1) \in Block(P_1(s_1)), \ldots, (B_m, N_m) \in Block(P_m(s_m))$ do 
15:         $P \leftarrow S_1 \cup \ldots \cup S_n \cup B_1 \cup \ldots \cup B_m$ 
16:         $N \leftarrow N_1 \cup \ldots \cup N_m$ 
17:         if $T_F^*(P) \models F$ and $T_F^*(P) \cap N = \emptyset$ then 
18:             return true 
19: return false 
end function 



there exists a support $S_i$ and for every $P_j(s_j)$ there exists a block $(B_j, N_j)$ 
such that $I'$ contains all $S_i$'s and $B_j$'s and is disjoint with every $N_j$. 

Hence, it suffices to exhaustively enumerate over all combinations of supports 
of $R_i(t_i)$'s (line 13) and blocks of $P_j(s_j)$'s (line 14) and use Lemma 2 to check 
if a combination can be realized by a repair (line 17). 

Finally, to show that Algorithm 3 works in time polynomial in the size of $I$, 
we note that the size of the query is considered to be a fixed constant and by 
Proposition 8 the number of supports and blocks of every fact is polynomial in 
the size of $I$. We remark that acyclicity is used only in Propositions 7 and 8. 

Theorem 3 Consistent query answering is in PTIME for any quantifier-free 
query and any acyclic set of denial constraints and full tuple-generating 
dependencies. 
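
The enumeration performed by Algorithm 3 can be sketched in Python as follows. Everything about the interface is an assumption of the sketch: the hull, the supports, and the blocks are passed in as precomputed dictionaries (here copied by hand from Examples 9 and 10), a CNF query is a list of clauses, and a clause is a list of pairs `(fact, positive)`.

```python
from itertools import product

def closure(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if rhs is not None and lhs <= facts and rhs not in facts:
                facts.add(rhs)
                changed = True
    return facts

def consistent(facts, rules):
    return all(not lhs <= facts or (rhs is not None and rhs in facts) for lhs, rhs in rules)

def exists_repair(tp, tn, hull, supp, block, rules):
    """Is there a repair containing all facts of tp and none of tn?"""
    if not tp <= hull:
        return False
    tn = tn & hull
    for ss in product(*(supp[f] for f in tp)):
        for bs in product(*(block[f] for f in tn)):
            P = set().union(*ss, *(b for b, _ in bs))
            N = set().union(*(n for _, n in bs))
            T = closure(P, rules)
            if consistent(T, rules) and T.isdisjoint(N):   # the test of Lemma 2
                return True
    return False

def cqa(cnf, hull, supp, block, rules):
    """True iff every clause holds in every repair."""
    for clause in cnf:
        tp = {f for f, positive in clause if not positive}  # must be present to falsify the clause
        tn = {f for f, positive in clause if positive}      # must be absent to falsify the clause
        if exists_repair(tp, tn, hull, supp, block, rules):
            return False
    return True

fs = frozenset
R111, R121 = ("R", 1, 1, 1), ("R", 1, 2, 1)
P11, P12, Q1, Q2 = ("P", 1, 1), ("P", 1, 2), ("Q", 1), ("Q", 2)
hull = {R111, R121, P12, Q2, P11, Q1}
rules = [(fs({R111}), P11), (fs({R121}), P12), (fs({P11}), Q1),
         (fs({P12}), Q1), (fs({P11, P12}), None)]
supp = {R111: {fs({R111})}, R121: {fs({R121})}, P12: {fs({P12})},
        Q2: {fs({Q2})}, P11: {fs({R111})}, Q1: {fs({R111}), fs({P12})}}
two = {(fs({R111}), fs()), (fs(), fs({Q1}))}
block = {Q1: {(fs(), fs({Q1}))}, P11: {(fs(), fs({P11}))},
         R111: {(fs(), fs({P11}))}, P12: two, R121: two, Q2: set()}

holds = cqa([[(Q1, True), (R111, False)], [(Q2, True)]], hull, supp, block, rules)
fails = cqa([[(R121, True), (P11, False)]], hull, supp, block, rules)
```

The first query, (Q(1) ∨ ¬R(1,1,1)) ∧ Q(2), holds in every repair, so `holds` is true. The second, R(1,2,1) ∨ ¬P(1,1), is falsified by the repair containing R(1,1,1), P(1,1), and Q(1), so `fails` is false. With the query size fixed, the two nested products range over polynomially many combinations.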

Example 11 (cont. Example 10) We execute Algorithm 3 with the following 
query: 

$$Q = (Q(1) \lor \neg R(1,1,1)) \land (Q(2) \lor \neg P(1,2)) \land (R(1,2,1) \lor \neg P(1,1)).$$

The negation of the first conjunct is $\neg Q_1 = R(1,1,1) \land \neg Q(1)$. The fact 
$R(1,1,1)$ has only one support $\{R(1,1,1)\}$ and the fact $Q(1)$ only one block 
$(\emptyset, \{Q(1)\})$. Although $T_F^*(\{R(1,1,1)\}) = \{R(1,1,1), P(1,1), Q(1)\}$ is consistent, 
it contains $Q(1)$. Hence, there does not exist a repair that satisfies $\neg Q_1$. 

The negation of the second conjunct is $\neg Q_2 = P(1,2) \land \neg Q(2)$. Because the 
fact $Q(2)$ has no block, there is no repair that does not contain $Q(2)$, and 
consequently there does not exist a repair satisfying $\neg Q_2$. 

The negation of the third conjunct is $\neg Q_3 = P(1,1) \land \neg R(1,2,1)$. The fact 
$P(1,1)$ has only one support $\{R(1,1,1)\}$ and the fact $R(1,2,1)$ has two blocks: 
$(B_1, N_1) = (\emptyset, \{Q(1)\})$ and $(B_2, N_2) = (\{R(1,1,1)\}, \emptyset)$. Similarly to $\neg Q_1$, combining 
the support with the first block does not guarantee the existence of a repair 
satisfying $\neg Q_3$. However, if we use the support with the second block, then 
$P = \{R(1,1,1)\}$ and $N = \emptyset$ satisfy Lemma 2, which implies that there exists 
a repair satisfying $\neg Q_3$. Indeed, this repair is $I'_2$. 

Consequently, the query $Q$ does not hold in the repair $I'_2$ and true is not the 
consistent answer to $Q$. 

5.2 Adding join dependencies 

In this section we extend Algorithm 3 to include also JDs. For this we generalize 
the definitions of supports and blocks and we show that Propositions 6, 7, 
and 8 continue to hold. Here, we present only the constructions and main 
claims. Complete proofs are presented in Appendix A. 

The following folklore result shows that we need to consider only the case where 
there is exactly one JD per relation. 

Proposition 9 For any (possibly empty) set of JDs $\{jd_1, \ldots, jd_n\}$ on the 
same relation name there exists a JD $jd^*$ such that an instance satisfies 
$\{jd_1, \ldots, jd_n\}$ if and only if it satisfies $jd^*$. 

We remark that in particular every relation $R$ in every instance satisfies the 
trivial JD $R \bowtie: [attrs(R)]$. Hence, we fix an instance $I$ and a set of constraints $F$ 
consisting of a set of acyclic full TGDs, denial constraints, and exactly one JD 
per relation. Also, to distinguish JD rules we write them $R(t_1) \land \ldots \land R(t_n) \rightsquigarrow 
R(t)$. We remark, however, that $\to$ continues to be used for all rules, including 
those obtained from conflicts w.r.t. JDs. Finally, we observe that any JD 
$R \bowtie: [X_1, \ldots, X_k]$ yields rules $R(t) \rightsquigarrow R(t)$ for all $R(t) \in Hull(I, F)$. Because 
we assume that $F$ contains one JD on every relation in $S$, $\rightsquigarrow$ and $\to$ (which 
subsumes $\rightsquigarrow$) are reflexive on facts from $Hull(I, F)$. 

Previously, the acyclicity of the set of integrity constraints implicitly provided 
a bound on the depth of the derivation of supports and blocks. This bound is 
essential when showing that supports and blocks can be constructed in time 
polynomial in the size of the database. Although JDs translate to cyclic full 
TGDs, it is sufficient to consider derivations of supports and blocks of bounded 
depth. 

Lemma 3 If $Rules(I, F)$ contains the following two ground rules 

$r' = R(t'_1) \land \ldots \land R(t'_k) \rightsquigarrow R(t_i)$ and $r'' = R(t_1) \land \ldots \land R(t_k) \rightsquigarrow R(t)$ 

for some $i \in \{1, \ldots, k\}$, then there exists $j \in \{1, \ldots, k\}$ such that $Rules(I, F)$ 
contains also 

$r^* = R(t_1) \land \ldots \land R(t_{i-1}) \land R(t'_j) \land R(t_{i+1}) \land \ldots \land R(t_k) \rightsquigarrow R(t)$. 

Because the set of constraints is cyclic, we construct the supports in an 
iterative manner allowing us to bound the derivation depth. The new rules for 
supports are obtained by appropriately incorporating JD-rules into $S_0$ and $S_1$ 
(Definition 11). 

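Definition 13 (Support) Let $h$ be the acyclic height of the dependency graph 
$\mathcal{D}(S, F)$. For $\ell \in \{-1, 0, \ldots, h\}$ an $\ell$-support of a fact $R(t) \in Hull(I, F)$ is a 
subset of $I$ defined with the following rules: 

$$S_0^{-1}:\ \frac{R(t) \in I}{\{R(t)\} \in Supp^{-1}(R(t))}$$

$$S_1^{\ell}:\ \frac{\begin{array}{c} R(t_1) \land \ldots \land R(t_k) \rightsquigarrow R(t) \\ R_{i,1}(t_{i,1}) \land \ldots \land R_{i,n_i}(t_{i,n_i}) \to R(t_i)\ \ \forall i \in \{1, \ldots, k\} \\ R(t) \notin I \qquad S_{i,j} \in Supp^{\ell-1}(R_{i,j}(t_{i,j}))\ \ \forall i \in \{1, \ldots, k\},\ \forall j \in \{1, \ldots, n_i\} \end{array}}{\bigcup_{i,j} S_{i,j} \in Supp^{\ell}(R(t))}$$

where $Supp^{\ell}(R(t))$ denotes the set of all $\ell$-supports of $R(t)$. A support of $R(t)$ 
is any element of the set $Supp(R(t)) = Supp^{h}(R(t))$. 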

We note that because $\to$ and $\rightsquigarrow$ are reflexive on facts from $Hull(I, F)$, the 
rule $S_1^{\ell}$ properly propagates supports, i.e. any $(\ell-1)$-support of $R(t)$ is also an 
$\ell$-support of $R(t)$. We also note that if all the JDs in $F$ are trivial, then the 
set of supports of a fact coincides with the set of supports from Definition 11. 

Proposition 10 For every $I' \in Repairs(I, F)$ and every $R(t) \in Hull(I, F)$ 

$$R(t) \in I' \iff \exists S \in Supp(R(t)).\ S \subseteq I'.$$

The if part is proved with a simple induction over the depth of the derivation of 
a support. The proof of the only if part is based on the following simple idea. A fact 
$R(t) \in I' \setminus I$ is present in the repair $I'$ to satisfy some ground (full TGD) rule. 
We identify this ground rule by considering an inconsistent instance $I' \setminus \{R(t)\}$. 
We use this rule to show that $R(t)$ has a support that validates the claim. 

Again, because the set of constraints is cyclic, we construct the block itera- 
tively to bound their derivation depth. The new rules for block are obtained 
by appropriately incorporating JD-rules into Bq, Bi, and B2 (Definition 12). 

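Definition 14 (Block) Let $h$ be the acyclic height of the dependency graph
$\mathcal{D}(S, F)$. For $\ell \in \{-1, 0, \ldots, h\}$ an $\ell$-block of a fact $R(t) \in Hull(I, F)$ is a pair
that consists of a subset of $I$ and a set of at most one fact from $Hull(I, F) \setminus I$,
defined with the following rules:

$$B_0:\ \frac{R(t) \notin I}{(\emptyset, \{R(t)\}) \in Block^{-1}(R(t))}$$

$$B_1:\ \frac{\begin{array}{c}
R(t_1) \wedge \ldots \wedge R(t_n) \wedge R_1(s_1) \wedge \ldots \wedge R_m(s_m) \rightarrow \mathit{false} \\
R(t) \wedge R(t_{i,1}) \wedge \ldots \wedge R(t_{i,k_i}) \rightarrow R(t_i) \quad \forall i \in \{1, \ldots, n\} \\
S_{i,j} \in Supp(R(t_{i,j})) \quad \forall i \in \{1, \ldots, n\},\ \forall j \in \{1, \ldots, k_i\} \\
R(t) \in I \qquad S_p \in Supp(R_p(s_p)) \quad \forall p \in \{1, \ldots, m\}
\end{array}}{\left(\bigcup_{i,j} S_{i,j} \cup \bigcup_p S_p,\ \emptyset\right) \in Block^{-1}(R(t))}$$

$$B_2^{\ell}:\ \frac{\begin{array}{c}
R(t_1) \wedge \ldots \wedge R(t_n) \wedge R_1(s_1) \wedge \ldots \wedge R_m(s_m) \rightarrow P(s) \\
R(t) \wedge R(t_{i,1}) \wedge \ldots \wedge R(t_{i,k_i}) \rightarrow R(t_i) \quad \forall i \in \{1, \ldots, n\} \\
S_{i,j} \in Supp(R(t_{i,j})) \quad \forall i \in \{1, \ldots, n\},\ \forall j \in \{1, \ldots, k_i\} \\
S_p \in Supp(R_p(s_p)) \quad \forall p \in \{1, \ldots, m\} \\
R(t) \in I \qquad (B, N) \in Block^{\ell-1}(P(s))
\end{array}}{\left(\bigcup_{i,j} S_{i,j} \cup \bigcup_p S_p \cup B,\ N\right) \in Block^{\ell}(R(t))}$$

where $Block^{\ell}(R(t))$ is the set of all $\ell$-blocks of $R(t)$. A block of $R(t)$ is any
element of the set $Block(R(t)) = Block^{h}(R(t))$.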

Also this time, we observe that because the derivation relations used in the
premises are reflexive, the rule $B_2^{\ell}$ properly propagates blocks, i.e. any
$(\ell-1)$-block of $R(t)$ is also an $\ell$-block of $R(t)$. We also note that this
definition of blocks generalizes Definition 12.

Proposition 11 For every $I' \in Repairs(I, F)$ and every $R(t) \in Hull(I, F)$

$$R(t) \notin I' \iff \exists (B, N) \in Block(R(t)).\ B \subseteq I' \wedge N \cap I' = \emptyset.$$

The if part is proved with a simple induction over the depth of derivation of
a block. The proof of the only if part is based on the following simple idea.
A fact $R(t) \in I \setminus I'$ is absent in the repair $I'$ because its presence would cause a
violation of some ground rule. We identify this rule by considering the inconsistent
instance $I' \cup \{R(t)\}$. We use this rule to show that $R(t)$ has a block
that validates the claim.

Proposition 12 For any fact $R(t)$ the sets $Supp(R(t))$ and $Block(R(t))$ can
be constructed in time polynomial in the size of $I$.

A simple combinatorial proof is presented in Appendix A. 

We recall that the proof of Theorem 3 relies on Lemma 2 and Propositions
6, 7, and 8. The proof of Lemma 2 does not assume the set of constraints
to be acyclic and the corresponding Propositions 10, 11, and 12 have been
proved for generalized supports and blocks. Consequently,

Corollary 1 Consistent query answering is in PTIME for any quantifier-free
query and any set of join dependencies, denial constraints, and acyclic
full tuple-generating dependencies.

5.3 Negative results 

It appears to be difficult to extend our approach beyond quantifier-free queries 
because of the following result. 

Theorem 4 ([15]) There exists an FD and a closed conjunctive query (using 
existential quantifiers) for which consistent query answering is coNP-complete. 

Also, the class of constraints is likely to be maximal as having even one cyclic
full TGD that is not a JD leads to intractability.

Theorem 5 There exists a positive atomic query and a set of integrity constraints
consisting of one FD and one cyclic full tuple-generating dependency
for which consistent query answering is coNP-complete.

Proof: The membership of consistent query answering in coNP follows from 
the definition of consistent query answers and Theorem 1. 

We show coNP-hardness by reducing the complement of 3COL to consistent
query answering. 3COL is a classic NP-complete problem of testing if a graph
has a legal 3-coloring [34]. A 3-coloring is an assignment of one of 3 colors
to each vertex of the graph. It is legal if no two adjacent vertices have the
same color. Take any undirected graph $G = (V, E)$, and let $V = \{v_1, \ldots, v_n\}$
and $E = \{e_1, \ldots, e_m\}$. We assume that $G$ has no isolated vertices (i.e. vertices
incident to no edge).

We use the schema

$$S = \{R(A, B, C, D),\ P(C)\}$$

with the set of integrity constraints

$$F = \{R : A \rightarrow B,\ \ R(x_1, y_1, z_1, z_2) \wedge R(x_2, y_2, z_1, z_2) \wedge P(z_1) \wedge y_1 \neq y_2 \rightarrow P(z_2)\}.$$

We use the following facts:

• $p_{i,j}^{k} = R(i, k, j-1, j)$ for each vertex $v_i$ with color $k$ incident to the edge $e_j$
(we create a separate copy for each edge incident to the vertex and for each
color);

• $q_j = P(j)$ indicating that the edges $e_1, \ldots, e_j$ connect properly colored
vertices (for $j \in \{0, \ldots, m\}$);

• 3 special facts: $r = R(n+1, 0, m, m+1)$, $r' = R(n+2, 1, m, m+1)$ and
$r'' = P(m+1)$.

The constructed instance is:

$$I_G = \{p_{i,j}^{k} \mid v_i \in V,\ e_j \in E,\ v_i \in e_j,\ 1 \le k \le 3\} \cup \{q_0, r, r'\}.$$

Now, we outline the interaction among the facts induced by the integrity
constraints. The FD ensures that every vertex has at most one color assigned
to it, i.e. for any $v_i \in V$, any two edges $e_{j_1}, e_{j_2} \in E$ incident to $v_i$, and any two
different colors $k_1, k_2 \in \{1, 2, 3\}$:

$$p_{i,j_1}^{k_1} \wedge p_{i,j_2}^{k_2} \rightarrow \mathit{false}. \qquad (4)$$

The full TGD ensures that the facts $q_j$ are properly used to incrementally
verify that all edges connect legally colored vertices, i.e. for any edge
$e_j = \{v_{i_1}, v_{i_2}\}$ and any two different colors $k_1, k_2 \in \{1, 2, 3\}$:

$$p_{i_1,j}^{k_1} \wedge p_{i_2,j}^{k_2} \wedge q_{j-1} \rightarrow q_j.$$

The full TGD also requires that if $q_m$ is inserted, which indicates a legal
coloring, then $r$ or $r'$ is to be deleted or $r''$ is to be inserted:

$$r \wedge r' \wedge q_m \rightarrow r''.$$

Consequently, the query used in the reduction checks if $r$ is not removed from
any of the repairs:

$$Q = r.$$

The main claim is that:

$$G \in 3COL \iff \exists I' \in Repairs(I_G, F).\ r \notin I'.$$
For the only if part, let $f$ be a legal 3-coloring of $G$. We construct the
following instance

$$I' = \{p_{i,j}^{f(i)} \mid v_i \in V,\ e_j \in E,\ v_i \in e_j\} \cup \{q_0, \ldots, q_m, r'\}.$$

It can be easily verified that this instance satisfies (i), (ii), and (iii) of Lemma 1
and hence $I'$ is a repair and $r \notin I'$.

For the if part, we note that since $r \notin I'$, by (iii) of Lemma 1 both $q_m$ and $r'$
are present in $I'$ and $r'' \notin I'$. $q_m \in I'$ implies that $\{q_0, \ldots, q_m\} \subseteq I'$. Therefore,
for every $j \in \{1, \ldots, m\}$, if $e_j = \{v_{i_1}, v_{i_2}\}$, there exist two different colors $k_1$ and
$k_2$ such that $p_{i_1,j}^{k_1}$ and $p_{i_2,j}^{k_2}$ are present in $I'$. Moreover, for any two facts
$p_{i,j_1}^{k_1}$ and $p_{i,j_2}^{k_2}$ in $I'$ we have $k_1 = k_2$. Hence, the function $f(i) = k$ such that
there exists $p_{i,j}^{k} \in I'$ is a properly defined legal 3-coloring of $G$. □
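
The construction of $I_G$ and of the instance $I'$ derived from a coloring can be sketched in Python. This is only an illustrative sketch: the tuple encoding of facts and the helper names `build_instance`, `coloring_instance`, and `satisfies_F` are assumptions made for this example and are not part of the original development. The check covers consistency with respect to $F$ only, not minimality.

```python
from itertools import combinations

def build_instance(n, edges):
    """I_G for vertices 1..n; edges[j-1] = (u, v) is the edge e_j (hypothetical encoding)."""
    m = len(edges)
    facts = set()
    for j, (u, v) in enumerate(edges, start=1):
        for i in (u, v):
            for k in (1, 2, 3):
                facts.add(("R", i, k, j - 1, j))      # p_{i,j}^k
    facts.add(("P", 0))                               # q_0
    facts.add(("R", n + 1, 0, m, m + 1))              # r
    facts.add(("R", n + 2, 1, m, m + 1))              # r'
    return facts

def coloring_instance(n, edges, f):
    """I' for a coloring f (dict: vertex -> color in {1, 2, 3})."""
    m = len(edges)
    inst = {("R", i, f[i], j - 1, j)
            for j, (u, v) in enumerate(edges, start=1) for i in (u, v)}
    inst |= {("P", j) for j in range(m + 1)}          # q_0, ..., q_m
    inst.add(("R", n + 2, 1, m, m + 1))               # r' (r is deleted)
    return inst

def satisfies_F(inst):
    """Check the FD R: A -> B and the full TGD of the reduction."""
    r_facts = [f for f in inst if f[0] == "R"]
    p_vals = {f[1] for f in inst if f[0] == "P"}
    for (_, x1, y1, c1, d1), (_, x2, y2, c2, d2) in combinations(r_facts, 2):
        if x1 == x2 and y1 != y2:                     # FD violated
            return False
        if ((c1, d1) == (c2, d2) and y1 != y2
                and c1 in p_vals and d1 not in p_vals):
            return False                              # TGD head P(z2) missing
    return True
```

For a triangle with `edges = [(1, 2), (2, 3), (1, 3)]`, `build_instance(3, edges)` is inconsistent (every vertex carries three colors), while `coloring_instance(3, edges, {1: 1, 2: 2, 3: 3})` satisfies both constraints and does not contain $r$.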

Finally, we note that the role of the FD can be simulated with a full TGD
giving almost the same reduction. Indeed, if we replace the FD with the
TGD $R(x, y, z_1, z_2) \wedge R(x, y', z'_1, z'_2) \wedge y \neq y' \rightarrow R(0, 0, 0, 0)$ and by $d$ denote the fact
$R(0, 0, 0, 0)$, then the rules (4) are replaced by $p_{i,j_1}^{k_1} \wedge p_{i,j_2}^{k_2} \rightarrow d$ for every
$v_i \in V$, any two edges $e_{j_1}, e_{j_2}$ incident to $v_i$, and any two different colors
$k_1, k_2 \in \{1, 2, 3\}$. We note that the fact $d$ is not involved in any other rules,
and therefore it is only present in a repair which assigns two different colors to
the same vertex. Now, the query needs to be augmented to check that in no
repair $r$ is deleted while $d$ is not inserted, i.e. in all repairs $r$ is absent only if
$d$ is present, $Q = (\neg r) \Rightarrow d = r \vee d$.

Corollary 2 There exists a quantifier-free ground query and a set of two full
cyclic TGDs for which consistent query answering is coNP-complete.

The complexity of computing consistent answers to atomic ground queries in 
the presence of full TGDs only remains an open question. 



6 Universal constraints 

In this section we investigate the complexity of consistent query answering 
and repair checking in the presence of arbitrary universal constraints. 

Lemma 4 For any set of universal constraints $F$ and any closed query $Q$,
repair checking is in coNP and consistent query answering is in $\Pi_2^p$.

Proof: We observe that checking if a set of facts is a maximal independent set
is in PTIME. The definition of a nondeterministic Turing machine checking
if an instance $I'$ is not a repair follows from Propositions 3 and 4. First, the
machine constructs $Compl(I')$ and checks if it is a maximal independent set.
If so, it nondeterministically attempts to construct a maximal independent set
$N$ such that $N^{+} <_I I'$.

The definition of a nondeterministic machine (with an NP oracle) that checks
if true is not the consistent query answer follows from Definition 5: true is
not the consistent answer if and only if there exists a repair where the query
answer is false. Hence, the machine nondeterministically creates an instance
$I'$, verifies that $I'$ is a repair by using the NP oracle (checking that $I'$ is
not accepted by the machine constructed above), and verifies that the query
answer in $I'$ is false. □
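
The guess-and-check structure of this proof can be illustrated with a brute-force sketch in which both nondeterministic guesses are replaced by exhaustive enumeration. It is exponential and meant only for tiny examples. The names `repairs` and `true_is_consistent_answer`, the explicit finite `universe` of candidate facts, and the consistency predicate passed as a function are assumptions of this sketch; a repair is taken to be a consistent instance whose symmetric difference with the original instance is minimal under set inclusion.

```python
from itertools import combinations

def powerset(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def repairs(instance, universe, is_consistent):
    """All consistent J within universe whose difference from instance is inclusion-minimal."""
    instance = frozenset(instance)
    # Pair each consistent candidate with its symmetric difference to the instance.
    candidates = [(j, instance ^ j) for j in powerset(universe) if is_consistent(j)]
    # Keep J unless some other consistent candidate is strictly closer.
    return [j for j, dj in candidates
            if not any(dk < dj for _, dk in candidates)]

def true_is_consistent_answer(instance, universe, is_consistent, query):
    """true is the consistent answer iff the closed query holds in every repair."""
    return all(query(rep) for rep in repairs(instance, universe, is_consistent))
```

For instance, with the single ground rule $a \rightarrow b$ and the instance $\{a\}$, the sketch returns the two repairs $\emptyset$ (delete $a$) and $\{a, b\}$ (insert $b$), so true is not the consistent answer to the query $a$.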

Theorem 6 There exists a positive atomic query, and a set of two FDs and
a universal constraint for which consistent query answering is $\Pi_2^p$-complete.

Proof: The membership is proved in Lemma 4. We prove $\Pi_2^p$-hardness by
reducing the problem of validity of $\forall^*\exists^*$QBF to $\mathcal{D}_{F,Q}$.

Consider the following $\forall^*\exists^*$QBF formula:

$$\Psi = \forall x_1, \ldots, x_n.\ \exists x_{n+1}, \ldots, x_{n+m}.\ \Phi,$$

where $\Phi = C_1 \wedge \ldots \wedge C_k$ is a quantifier-free 3CNF formula, i.e. $C_j$ is a clause of three
literals $L_{j,1} \vee L_{j,2} \vee L_{j,3}$. Recall that checking the validity of $\forall^*\exists^*$QBF is a
classical $\Pi_2^p$-complete problem [34].

We assume that no two clauses of $\Phi$ are identical. For ease of reference, we call
the variables $x_1, \ldots, x_n$ universal and the variables $x_{n+1}, \ldots, x_{n+m}$ existential.
We also use the following functions on the literals of $\Phi$:

$$var(x_i) = var(\neg x_i) = i, \qquad sgn(x_i) = 1, \qquad sgn(\neg x_i) = 0,$$

$$q(x_i) = q(\neg x_i) = \begin{cases} 0 & \text{if } i \le n, \\ 1 & \text{otherwise.} \end{cases}$$

The schema contains two relation names:

$$S = \{R(A_1, B_1, A_2, B_2),\ D(A_1, B_1, C_1, D_1, \ldots, A_4, B_4, C_4, D_4)\}.$$

The set of integrity constraints is:

$$F = \{R : A_1 \rightarrow B_1,\ \ R : A_2 \rightarrow B_2,\ \ D(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4) \rightarrow R(\bar{x}_1) \vee R(\bar{x}_2) \vee R(\bar{x}_3) \vee R(\bar{x}_4)\},$$

where each $\bar{x}_i$ is a vector of 4 variables. We use the following types of facts in
the reduction:
• facts corresponding to the valuations of universal variables ($i \in \{1, \ldots, n\}$):

$$p_i = R(i, 1, 0, 0) \quad\text{and}\quad \bar{p}_i = R(i, 0, 0, 0),$$

and the valuations of existential variables ($i \in \{n+1, \ldots, n+m\}$):

$$p_i = R(i, 1, 1, 0) \quad\text{and}\quad \bar{p}_i = R(i, 0, 1, 0),$$

• facts corresponding to the clauses ($j \in \{1, \ldots, k\}$):

$$q_j = D(var(L_{j,1}), sgn(L_{j,1}), q(L_{j,1}), 0,\ var(L_{j,2}), sgn(L_{j,2}), q(L_{j,2}), 0,\ var(L_{j,3}), sgn(L_{j,3}), q(L_{j,3}), 0,\ 0, 1, 1, 1),$$

• 2 special facts:

$$r = R(0, 1, 1, 1) \quad\text{and}\quad \bar{r} = R(0, 0, 0, 0).$$

The constructed instance is

$$I_\Psi = \{p_1, \bar{p}_1, \ldots, p_{n+m}, \bar{p}_{n+m}, q_1, \ldots, q_k, \bar{r}\}.$$
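
The instance of this reduction can be generated mechanically, as in the following sketch. The encoding is an assumption of the example: a literal is a nonzero integer (`+i` for $x_i$, `-i` for $\neg x_i$), facts are plain tuples, and `build_instance` and `ground_heads` are hypothetical helper names. The third coordinate of each literal group is taken to be 0 for universal and 1 for existential variables, which is what the facts $p_i$ and $\bar{p}_i$ above require.

```python
def build_instance(n, m, clauses):
    """Facts of the instance for n universal and m existential variables."""
    def q(i):
        # 0 for universal variables (i <= n), 1 for existential ones
        return 0 if i <= n else 1

    facts = set()
    for i in range(1, n + m + 1):
        facts.add(("R", i, 1, q(i), 0))     # p_i      (x_i is true)
        facts.add(("R", i, 0, q(i), 0))     # bar p_i  (x_i is false)
    for clause in clauses:
        args = []
        for lit in clause:
            args += [abs(lit), 1 if lit > 0 else 0, q(abs(lit)), 0]
        args += [0, 1, 1, 1]                # coordinates of the special fact r
        facts.add(("D", *args))             # q_j
    facts.add(("R", 0, 0, 0, 0))            # bar r
    return facts

def ground_heads(d_fact):
    """The four R-facts in the head of the ground rule induced by a D-fact."""
    args = d_fact[1:]
    return [("R", *args[i:i + 4]) for i in range(0, 16, 4)]
```

For the clause $\neg x_1 \vee x_4 \vee x_2$ with $n = 3$ and $m = 2$, `ground_heads` yields $\bar{p}_1$, $p_4$, $p_2$, and $r$, i.e. exactly the facts that can satisfy the ground rule for that clause.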

For the clarity of further considerations, by $\ell_{j,p}$ we will denote the fact
corresponding to the satisfying valuation of the literal $L_{j,p}$, i.e.:

$$\ell_{j,p} = \begin{cases} p_i & \text{when } L_{j,p} = x_i, \\ \bar{p}_i & \text{when } L_{j,p} = \neg x_i. \end{cases}$$

Now, we outline the interaction among the facts induced by the integrity
constraints. We start with the simple observation that the FD $R : A_1 \rightarrow B_1$
ensures that in every repair there is at most one fact corresponding to a
valuation of each variable. In symbols:

$$p_i \wedge \bar{p}_i \rightarrow \mathit{false} \quad\text{for } i \in \{1, \ldots, n+m\}.$$

The universal constraint ensures that for every clause, if the repair does not have a
fact corresponding to a valuation satisfying the clause, then the fact $q_j$ is
deleted or the fact $r$ is inserted:

$$q_j \rightarrow \ell_{j,1} \vee \ell_{j,2} \vee \ell_{j,3} \vee r \quad\text{for } j \in \{1, \ldots, k\}.$$

Inserting $r$ requires removing all the facts corresponding to valuations of the
existential variables (the FD $R : A_2 \rightarrow B_2$):

$$p_i \wedge r \rightarrow \mathit{false} \quad\text{and}\quad \bar{p}_i \wedge r \rightarrow \mathit{false} \quad\text{for } i \in \{n+1, \ldots, n+m\}.$$

It is important to note that this makes inserting $r$ quite a drastic way to repair
the instance $I_\Psi$. Since $r$ does not belong to $I_\Psi$ and all facts corresponding to
valuations of existential variables do, $\le_I$-minimality ensures that such a way
of repairing is considered only if for a given valuation of universal variables
there does not exist a valuation of existential variables satisfying $\Phi$. Also, we
observe that inserting $r$ requires deleting $\bar{r}$, i.e.

$$r \wedge \bar{r} \rightarrow \mathit{false}.$$

Consequently, the query used in the reduction checks if the fact $\bar{r}$ is not deleted
from any of the repairs:

$$Q = \bar{r}.$$

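Figure 5 contains the extended conflict hypergraph of $I_\Psi$ for

$$\Psi = \forall x_1, x_2, x_3.\ \exists x_4, x_5.\ (\neg x_1 \vee x_4 \vee x_2) \wedge (\neg x_2 \vee \neg x_5 \vee x_3).$$

The dotted lines are used for stabilizing edges.

Fig. 5. $G(I_\Psi, F)$ for $\Psi = \forall x_1, x_2, x_3.\ \exists x_4, x_5.\ (\neg x_1 \vee x_4 \vee x_2) \wedge (\neg x_2 \vee \neg x_5 \vee x_3)$.

The main claim of the reduction is:

$$\big(\forall I' \in Repairs(I_\Psi, F).\ \bar{r} \in I'\big) \iff\ \models \Psi,$$

where $\models \Psi$ denotes that $\Psi$ is valid.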

For the only if part, we start by observing that no repair contains $r$. We also
show that for any consistent instance $J_1 \subseteq \{p_1, \bar{p}_1, \ldots, p_n, \bar{p}_n\}$ there exists a
repair $I'$ such that $I' <_I J_1$ and such that $\{q_1, \ldots, q_k\} \subseteq I'$. Indeed, consider
$J = J_1 \cup \{q_1, \ldots, q_k, r\}$. Clearly, it is consistent and since no repair contains
$r$, $J$ is not a repair. Consequently, there exists a repair $I'$ such that $I' <_I J$.
It is easy to see that $I'$ is the required repair.

Now, we take any valuation of universal variables $V_1$, construct the consistent
instance

$$J_1 = \{p_i \mid V_1(x_i) = \mathit{true},\ i \in \{1, \ldots, n\}\} \cup \{\bar{p}_i \mid V_1(x_i) = \mathit{false},\ i \in \{1, \ldots, n\}\},$$

and take the repair $I'$ as described above. Naturally, for every $i \in \{1, \ldots, n\}$
either $p_i$ or $\bar{p}_i$ belongs to $I'$. The same holds for $i \in \{n+1, \ldots, n+m\}$ because
$r \notin I'$ and $I'$ is $\le_I$-minimal. Consequently, the following valuation of the
existential variables $x_{n+1}, \ldots, x_{n+m}$ is properly defined

$$V_2(x_i) = \begin{cases} \mathit{true} & \text{if } p_i \in I', \\ \mathit{false} & \text{if } \bar{p}_i \in I'. \end{cases}$$

We claim that $V_1 \cup V_2 \models \Phi$. Take any clause $C_j$ and observe that at least
one of $\ell_{j,1}, \ell_{j,2}, \ell_{j,3}$ is present in $I'$ because $r \notin I'$, $q_j \in I'$, and $I'$ is a repair.
Consequently, $V_1 \cup V_2$ assigns true to at least one of $L_{j,1}, L_{j,2}, L_{j,3}$.

Now, we show the if part by contradiction: we assume there exists a repair $I'$
such that $\bar{r} \notin I'$ and we construct a consistent instance $I''$ such that $I'' <_I I'$.

First, we observe that $\bar{r} \notin I'$ implies that $r \in I'$. By $\le_I$-minimality this gives
$\{q_1, \ldots, q_k\} \subseteq I'$. Also, for every $i \in \{1, \ldots, n\}$ either $p_i$ or $\bar{p}_i$ belongs to $I'$
and for $i \in \{n+1, \ldots, n+m\}$ neither $p_i$ nor $\bar{p}_i$ belongs to $I'$.

Consequently, the following valuation of the universal variables $x_1, \ldots, x_n$ is
properly defined

$$V_1(x_i) = \begin{cases} \mathit{true} & \text{if } p_i \in I', \\ \mathit{false} & \text{if } \bar{p}_i \in I'. \end{cases}$$

Since $\models \Psi$, there exists a valuation of the existential variables $V_2$ such that
$V = V_1 \cup V_2 \models \Phi$. The instance $I''$ is defined as follows:

$$I'' = \{\bar{r}, q_1, \ldots, q_k\} \cup \{p_i \mid V(x_i) = \mathit{true},\ i \in \{1, \ldots, n+m\}\} \cup \{\bar{p}_i \mid V(x_i) = \mathit{false},\ i \in \{1, \ldots, n+m\}\}.$$

$V \models \Phi$ implies that $I''$ is consistent. Note that $I'$ and $I''$ agree on the facts
$\{p_1, \bar{p}_1, \ldots, p_n, \bar{p}_n\}$, both contain $\{q_1, \ldots, q_k\}$, but $I''$ contains $\bar{r}$ and some of
the facts $\{p_{n+1}, \bar{p}_{n+1}, \ldots, p_{n+m}, \bar{p}_{n+m}\}$ whereas $I'$ contains none of them. This
shows that $I''$ is relatively closer to $I_\Psi$ than $I'$, i.e. $I'' <_I I'$. □

Corollary 3 There exists a set of 2 FDs and one universal constraint for
which repair checking is coNP-complete.

Proof: We use the reduction from the proof of Theorem 6 to reduce the
complement of 3SAT to $\mathcal{B}_F$: a 3CNF formula $\Phi$ is treated as a $\forall^*\exists^*$QBF with no universally
quantified variables. Let $F$ be the set of constraints as defined in the previous
reduction and $I_\Phi$ be the instance obtained from $\Phi$. We take $I'_\Phi = \{r, q_1, \ldots, q_k\}$
and claim that

$$\Phi \notin 3SAT \iff I'_\Phi \in Repairs(I_\Phi, F).$$

The proof of this claim is analogous to the proof of Theorem 6. □
7 Related work



Here we only discuss work relevant to our contributions and we refer the reader 
to surveys of the topic [9,11,14,22]. 

In general, three different approaches to compute consistent query answers 
have been proposed: query rewriting, logic programming, and compact repre- 
sentation of repairs. Our work belongs to the last category. 

Query rewriting was the first approach proposed to compute consistent query 
answers [3]. A query Q is rewritten into a query Q' whose evaluation returns 
the set of consistent query answers to Q. An indisputable advantage of this ap- 
proach is the ease of its incorporation into already existing applications. How- 
ever, applicability of this approach is limited and certain conjunctive queries 
are known not to have rewritings [15,25,41]. 

[3] uses the notion of residues obtained from constraints to identify potential
impact of integrity violations on the query results. The residues are used to
construct rewriting rules for the atoms used in the query. This approach has
been shown to be applicable to quantifier-free conjunctive queries in the
presence of binary universal constraints. Chomicki and Marcinkowski [15] observe
that if the set of constraints contains only one FD per relation, the conflict
graph is a union of disjoint full multipartite graphs. This simple structure
allows one to construct rewritings for simple conjunctive queries, i.e. conjunctive
queries without repeated relation names and without variable sharing. The result
of Chomicki and Marcinkowski has been further generalized by Fuxman and
Miller [24,23,25] to allow restricted variable sharing (joins) in the conjunctive
queries. The class $C_{forest}$ of allowed queries is defined using the notion
of join graph of a query whose vertices are the literals used in the query and
an edge runs from a literal $R_i$ to a literal $R_j$ if there is a variable which
occurs on a non-key attribute of $R_i$ and any attribute of $R_j$ (both occurrences
have to be different if $i = j$). The class $C_{forest}$ consists of queries whose join
graph is a forest, the joins are full and the join conditions are non-key to key.
Wijsen [41] presents a rewriting scheme for the class of rooted queries which
further extends $C_{forest}$. We remark that the class of rooted queries is semantically
defined and its subclass is captured with a syntactic characterization using
an alternative notion of the join graph.

Several approaches have been developed to compute consistent query answers 
using logic programs with disjunction and classical negation [4,7,20,26,27,37]. 

Essentially, all of them use disjunctive rules to model the process of repairing
violations of constraints. In this way stable models of a program correspond to
the repairs of the inconsistent database. A query evaluated under the cautious
semantics returns the answers present in every model, which naturally yields
the consistent query answers.

The main advantage of using this approach is its generality: typically arbitrary
first-order (or even Datalog$^\neg$) queries are handled in the presence of universal
constraints. Also, the repairing programs can be easily evaluated with existing
logic program environments like Smodels or dlv [19]. We note, however, that
the systems computing answers to logic programs usually perform grounding,
which may be cost prohibitive if we are to work with large databases. Another
disadvantage of this approach is that the class of disjunctive logic programs
is known to be $\Pi_2^p$-complete.

These difficulties are addressed in the INFOMIX system [20] with several
optimizations geared toward effective execution of repairing programs. One is
localization of conflicts with identification of the affected database which consists
of all facts involved in constraint violations and all syntactically propagated
conflict-bound facts (analogous to applying $T_P$). Another optimization involves
using bit-vectors to encode fact membership in each repair and subsequent use
of a bitwise aggregate function to find tuples that are present in every repair. This
optimization, however, may be insufficient to handle databases with large numbers
of conflicts because typically the number of repairs is exponential in the
number of conflicts. Recently, this deficiency has been addressed with repair
factorization [21]. Essentially, the affected database is decomposed into parts
that are conflict-disjoint (no two mutually conflicting facts are in separate
parts). When computing consistent answers to a query only parts that are
simultaneously spanned by the query are considered at a time. We observe
an analogy to computing consistent query answers using hypergraphs: when
finding whether true is the consistent answer to a ground query, Algorithm 3
analyzes base fragments of repairs obtained by combining the hyperedges
adjacent to facts from the query.

Our work was inspired by positive results for denial constraints presented
in [15]. There, the repairs are obtained by deleting facts only and consequently
the repairs are subsets of the original instance. [15] also investigates using
subset repairs to define consistent query answers in the presence of
inclusion dependencies (IND), i.e. formulas of the form

$$\forall \bar{x}_1 \exists \bar{x}_3.\ R(\bar{x}_1) \rightarrow P(\bar{x}_2, \bar{x}_3),$$

where $\bar{x}_2 \subseteq \bar{x}_1$. An IND of this form is commonly written $R[X] \subseteq P[Y]$,
where $X$ and $Y$ are the sets of attributes corresponding to $\bar{x}_2$ in $R$ and $P$
respectively. An IND $R[X] \subseteq P[Y]$ is a foreign-key dependency if $Y$ is the
key of $P$. We note that universal constraints capture only full INDs, i.e. INDs
with no existentially quantified variables. [15] shows that consistent query
answering is in PTIME for quantifier-free and simple conjunctive queries in
the presence of foreign-key dependencies and one key dependency per relation.
It is also shown that relaxing the restriction on the set of integrity constraints
leads to intractability and consistent query answering for arbitrary sets of
INDs and FDs becomes $\Pi_2^p$-complete.

Using subset repairs is natural in scenarios like data warehousing, where the 
data is complete but may be incorrect. In particular we can assume that if 
a fact is not present in the original database, then it is not true. Obtaining 
repairs by deletion of facts only is not necessarily a natural approach in the 
scenarios where we cannot assume that information missing in the database 
is false, for instance in the context of integration of sources that may be 
missing some information. Then, we might want to consider standard repairs 
obtained by deleting and inserting a minimal set of facts, i.e. repairs in the 
sense of Definition 4. We observe, however, that while in the case of universal
constraints the missing facts that create conflicts are implicitly defined, the
presence of existentially quantified variables in INDs leads to a possibly infinite
number of repairs.

Cali et al. in [13] show that consistent query answering becomes undecidable
for arbitrary sets of INDs and FDs. The problem becomes decidable when
the set of integrity constraints is restricted to non-key-conflicting INDs; an IND
$R[X] \subseteq P[Y]$ is non-key-conflicting if $Y$ is not a strict superset of the key of
$P$. Then, the problem of consistent query answering is $\Pi_2^p$-complete.

Another compact representation of all repairs is nucleus [39,40]. In this ap- 
proach all repairs are represented by a tableau (a table with free variables), 
and queries are evaluated in the standard way (answers with variables are dis- 
carded). We note that for some classes of constraints, constructing the nucleus 
may take an exponential time to complete. 

[2] provides a thorough study of the complexity of repair checking for 4 different
notions of minimality used to define repairs: minimality of symmetric
set difference (Definition 4), minimality of asymmetric set difference (which 
yields subset repairs), minimality of the cardinality of symmetric set difference, 
and minimality of the cardinality of symmetric difference on every relation. 
The classes of the considered integrity constraints include denial constraints, 
inclusion dependencies, equality-generating dependencies, and weakly acyclic 
and local-as-view (LAV) tuple-generating dependencies. The results offer addi- 
tional insight into the problem of repair checking in the presence of full TGDs 
for the symmetric and asymmetric set difference notion of minimality. The 
authors show that for full TGDs repair checking is PTIME-hard, which in 
view of Theorem 1 makes the problem PTIME-complete. It is a general belief,
based on the assumption $NC \subsetneq PTIME$, that there do not exist fast parallel
algorithms for PTIME-complete problems. This suggests that database repairing
(Algorithm 2) using a parallel computation model does not guarantee an
efficiency improvement. However, the authors also show that for weakly acyclic
LAV tuple-generating dependencies the problem is in LOGSPACE (which is
included in NC). On the other hand, repair checking easily becomes coNP-complete
if we relax the restrictions on the set of constraints; for instance,
if we consider weakly acyclic TGDs without the LAV restriction. We remark
that these results are orthogonal to Theorem 1 as the classes of constraints
are incomparable.



8 Conclusions and future work 

In this paper we investigated the complexity of computing consistent query 
answers in the presence of universal constraints. We proposed an extended 
version of the conflict hypergraph. Its size is polynomial in the size of the 
database and it captures all repairs w.r.t. the given set of universal constraints.
Hence, we consider it to be a compact representation of all repairs.
This property is essential for using the extended conflict hypergraph to com- 
pute consistent query answers. 

Extending the notions of conflicts to include negations of facts leads, how- 
ever, to a significant increase of computational complexity. Consistent query 
answering is $\Pi_2^p$-complete in the presence of universal constraints even when
using atomic queries. The problem becomes coNP-complete when we restrict 
the set of constraints to contain full tuple-generating dependencies and de- 
nial constraints only; then the conflicts can contain at most one negation of 
a fact. If we further restrict the integrity constraints to join dependencies, 
denial constraints, and acyclic full tuple-generating dependencies, then the 
problem of consistent answering becomes tractable for quantifier-free queries. 
Consequently, we present an extension of the algorithm of Chomicki and 
Marcinkowski [15] that finds if true is the consistent answer to a closed 
quantifier-free query. 

The problem of repair checking is also intractable for universal constraints. 

It becomes tractable if we restrict the constraints to full tuple-generating de- 
pendencies and denial constraints. Consequently, we present a polynomial re- 
pairing algorithm. It is both sound (always produces a repair) and complete 
(every repair can be produced). 

The summary of computational complexity results is presented in Table 1; its 
last row is taken from [15]. 

Table 1. Summary of complexity results for universal constraints.

We envision several possible directions of future study. First, we would like
to investigate practical applicability of our approach. The main obstacle lies
in the high degree of the polynomials used to bound the number of all
supports and blocks, and consequently in the high degree of the polynomial
describing the complexity of Algorithm 3. We observe that the bounds are
estimates of the pessimistic case where the number of conflicts (ground rules)
in the database is very high and the set of integrity constraints very complex.
We believe that in practical scenarios the number of conflicts is small enough
to be stored in main memory and the acyclic height of the set of integrity
constraints is rather small.

Although computing consistent answers to arbitrary conjunctive queries is 
long known to be intractable [15], considerable effort has been made to find 
practical subclasses of conjunctive queries for which consistent answering is 
tractable [41,25,28]. Usually, tractability comes at the price of restricting the 
class of constraints to primary key constraints. However, it would be inter- 
esting to see for what subclasses of universal constraints similar techniques 
could be used to handle conjunctive queries. Another interesting challenge 
in this direction is a generalization of Algorithm 3 to handle sets of univer- 
sal constraints and arbitrary queries with quantifiers. Because of the negative 
complexity results, we cannot expect that a generalized algorithm would work 
in polynomial time (unless P=NP). We believe, however, that in most prac- 
tical cases such an algorithm should not require exponential time. This belief 
is based on the promising results of heuristics used to optimize the INFOMIX 
system [20,21] and its conceptual closeness to Algorithm 3 (see Section 7). 

It would be interesting to see if using an alternative definition of repairs would
affect the complexity of consistent query answering and repair checking in the
presence of universal constraints. For instance, we observe that if we consider
repairs obtained by deleting facts only, then the repairs are maximal consistent
subsets of the original instance. It would seem that this property simplifies
reasoning about repairs, allowing us to employ algorithms similar to those used
for denial constraints (where all repairs are obtained by deleting facts only).
For example, a consistent subset $I'$ of an instance $I$ is a repair of $I$ w.r.t. a set of denial
constraints if and only if $I' \cup \{R(t)\}$ is inconsistent for any $R(t) \in I \setminus I'$. This
is not necessarily true in the case of universal constraints, where we need to
check that $I' \cup X$ is inconsistent for every nonempty $X \subseteq I \setminus I'$. In fact, our
preliminary research shows that this problem remains coNP-complete. Also,
we believe that the positive results carry over to the setting of subset repairs as
well.
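
The difference between the two maximality tests can be made concrete with a small sketch. The function names, the representation of instances as sets of fact labels, and the example constraints ($a \rightarrow b$, $b \rightarrow a$, and $c \wedge d \rightarrow \mathit{false}$) are assumptions chosen for illustration; the exhaustive test is exponential in $|I \setminus I'|$.

```python
from itertools import combinations

def passes_single_fact_test(instance, candidate, is_consistent):
    """Test that suffices for denial constraints: no single deleted fact can be restored."""
    return is_consistent(candidate) and all(
        not is_consistent(candidate | {fact}) for fact in instance - candidate)

def is_subset_repair(instance, candidate, is_consistent):
    """General test: no nonempty set X of deleted facts can be restored."""
    if not is_consistent(candidate):
        return False
    rest = list(instance - candidate)
    for size in range(1, len(rest) + 1):
        for extra in combinations(rest, size):
            if is_consistent(candidate | set(extra)):
                return False
    return True

def consistent(inst):
    """Hypothetical constraints: a -> b, b -> a, and (c and d) -> false."""
    return ("a" in inst) == ("b" in inst) and not ({"c", "d"} <= inst)

I = frozenset({"a", "b", "c", "d"})
candidate = frozenset({"c"})
```

Here `candidate` passes the single-fact test, because adding `a`, `b`, or `d` alone violates a constraint, yet it is not a subset repair: restoring `a` and `b` together gives the consistent instance `{a, b, c}`.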
A Omitted proofs

Proposition 9 For any set of JDs $\{jd_1, \ldots, jd_n\}$ on the same relation there
exists a JD $jd^*$ such that an instance satisfies $\{jd_1, \ldots, jd_n\}$ if and only if it
satisfies $jd^*$.

Proof: The proof is by induction over $n$. For $n = 0$ we note that every
relation satisfies the trivial JD: $R \bowtie{:}[attrs(R)]$, where $attrs(R)$ is the set of all
attributes of $R$. The hypothesis is also trivially satisfied for $n = 1$. To show
the inductive step it suffices to show that two JDs $jd_1 = R \bowtie{:}[X_1, \ldots, X_n]$ and
$jd_2 = R \bowtie{:}[Y_1, \ldots, Y_m]$ are equivalent to

$$jd^* = R \bowtie{:}[X_1 \cap Y_1, \ldots, X_1 \cap Y_m, \ldots, X_n \cap Y_1, \ldots, X_n \cap Y_m].$$

To prove this equivalence we use standard relation algebra [1] and recall that
a JD $R \bowtie{:}[Z_1, \ldots, Z_k]$ is defined as $R = \pi_{Z_1}(R) \bowtie \ldots \bowtie \pi_{Z_k}(R)$.

First we note that every instance satisfies $\pi_{Z_1}(R) \bowtie \ldots \bowtie \pi_{Z_k}(R) \subseteq R$ for any
sets $Z_1, \ldots, Z_k$ of attributes of $R$ whose union is $attrs(R)$. Hence, it suffices
to show that

$$R \subseteq \pi_{X_1 \cap Y_1}(R) \bowtie \ldots \bowtie \pi_{X_1 \cap Y_m}(R) \bowtie \ldots \bowtie \pi_{X_n \cap Y_1}(R) \bowtie \ldots \bowtie \pi_{X_n \cap Y_m}(R)$$

in any instance $I$ that satisfies $jd_1$ and $jd_2$.

We fix an instance $I$ and let $r$ be the set of all tuples that belong to the relation
$R$ in $I$. Take any $t \in r$ and let $t_{X_i} = \pi_{X_i}(t)$, $t_{Y_j} = \pi_{Y_j}(t)$, and $t_{X_i \cap Y_j} = \pi_{X_i \cap Y_j}(t)$
for any $i \in \{1, \ldots, n\}$ and any $j \in \{1, \ldots, m\}$. Since $I$ satisfies $jd_1$ and $jd_2$,
we have that $t = t_{X_1} \bowtie \ldots \bowtie t_{X_n} \in r$ and $t = t_{Y_1} \bowtie \ldots \bowtie t_{Y_m} \in r$. Now, we
observe that

$$t_{X_1 \cap Y_1} \bowtie \ldots \bowtie t_{X_1 \cap Y_m} \bowtie \ldots \bowtie t_{X_n \cap Y_1} \bowtie \ldots \bowtie t_{X_n \cap Y_m} = t_{X_1} \bowtie \ldots \bowtie t_{X_n}. \ \Box$$



We recall that the lhs of a ground rule is represented with a set of facts obtained
from grounding the atoms of some constraint. If the same fact is obtained by
grounding more than one atom in the constraint, then it is not repeated in the
ground rule. We continue to use this representation, but on some occasions
we will need to know the duplicates in ground JD rules. Then, the lhs is
represented with a bag rather than a set, and we call such a rule unfolded.
Naturally, every rule can be unfolded, although not always unambiguously. We
remark that this ambiguity does not affect the correctness of our considerations,
and hence we ignore it.

Example 12 (Unfolded rule) Suppose a schema consisting of one relation
name $R(A, B, C, D)$ and let the set of integrity constraints contain one JD
$R \bowtie{:}[AB, BC, CD]$, which is represented with the following formula

$$R(x, y, z, s) \wedge R(x', y, z', s') \wedge R(x'', y'', z', s'') \to R(x, y, z', s'').$$

The following ground rule is one of the possible instantiations of the formula above.

$$R(0, 0, 0, 0) \wedge R(1, 0, 0, 1) \to R(0, 0, 0, 1).$$

Its unfolding is

$$R(0, 0, 0, 0) \wedge R(1, 0, 0, 1) \wedge R(1, 0, 0, 1) \to R(0, 0, 0, 1).$$
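
The unfolding in Example 12 can be reproduced by enumerating all groundings of the JD over a given set of facts. The sketch below is only an illustration under stated assumptions: the schema and the JD components are hard-coded to those of the example, facts are plain tuples, and the helper names (`unfolded_rules`, `unfoldings`) are hypothetical.

```python
from itertools import product

ATTRS = "ABCD"                   # schema of R, as in Example 12
COMPONENTS = ["AB", "BC", "CD"]  # components of the JD, as in Example 12

def proj(t, attrs):
    """Values of tuple t on the attributes in `attrs`, in schema order."""
    return tuple(t[ATTRS.index(a)] for a in ATTRS if a in attrs)

def unfolded_rules(facts):
    """Yield (lhs, head) for every grounding of the JD over `facts`.

    The lhs is a tuple with one fact per atom of the constraint, so a fact
    used for several atoms is repeated (the bag representation).
    """
    k = len(COMPONENTS)
    for lhs in product(sorted(facts), repeat=k):
        # Facts must agree on the attributes shared by their components.
        agree = all(
            proj(lhs[a], set(COMPONENTS[a]) & set(COMPONENTS[b]))
            == proj(lhs[b], set(COMPONENTS[a]) & set(COMPONENTS[b]))
            for a in range(k) for b in range(a + 1, k)
        )
        if agree:
            # Each head attribute is taken from the first component covering it.
            head = tuple(
                next(lhs[i][p] for i in range(k) if x in COMPONENTS[i])
                for p, x in enumerate(ATTRS)
            )
            yield lhs, head

def unfoldings(facts, head):
    """All unfolded left-hand sides over `facts` that derive `head`."""
    return [lhs for lhs, h in unfolded_rules(facts) if h == head]
```

Calling `unfoldings({(0, 0, 0, 0), (1, 0, 0, 1)}, (0, 0, 0, 1))` returns two different bags that collapse to the same set of facts, which illustrates why a rule cannot always be unfolded unambiguously.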

Lemma 3 If $Rules(I, F)$ contains the following two unfolded ground rules

$$r' = R(t'_1) \wedge \ldots \wedge R(t'_k) \to R(t_i) \quad \text{and} \quad r'' = R(t_1) \wedge \ldots \wedge R(t_k) \to R(t)$$

for some $i \in \{1, \ldots, k\}$, then there exists $j \in \{1, \ldots, k\}$ such that $Rules(I, F)$
contains also

$$r^* = R(t_1) \wedge \ldots \wedge R(t_{i-1}) \wedge R(t'_j) \wedge R(t_{i+1}) \wedge \ldots \wedge R(t_k) \to R(t).$$

Proof: We assume that the rules are obtained from grounding the join dependency
$R \bowtie{:}[X_1, \ldots, X_k]$ and recall that it is represented as the following
full TGD:

$$R(x_1) \wedge \ldots \wedge R(x_k) \wedge \bigwedge_{1 \leq \alpha, \beta \leq k} x_\alpha[X_\alpha \cap X_\beta] = x_\beta[X_\alpha \cap X_\beta] \to R(y),$$

where $y \subseteq x_1 \cup \ldots \cup x_k$ such that $y[X_\alpha \setminus \bigcup_{1 \leq \beta < \alpha} X_\beta] = x_\alpha[X_\alpha \setminus \bigcup_{1 \leq \beta < \alpha} X_\beta]$
for $\alpha \in \{1, \ldots, k\}$. W.l.o.g. we assume that the order of facts in the lhs of
rules $r'$ and $r''$ corresponds to the order of the atoms in the definition of the
constraints. With this assumption a rule $R(s_1) \wedge \ldots \wedge R(s_k) \to R(s)$ belongs
to $Rules(I, F)$ if and only if

$$s_\alpha[X_\alpha] = s[X_\alpha] \quad \text{for every } \alpha \in \{1, \ldots, k\}, \qquad \text{(A.1)}$$

$$s_\alpha[X_\alpha \cap X_\beta] = s_\beta[X_\alpha \cap X_\beta] \quad \text{for every } \alpha, \beta \in \{1, \ldots, k\}. \qquad \text{(A.2)}$$

Also, with the assumption on the order in which the facts are listed in rules,
we show that the claim of the lemma holds for $j = i$.

$t'_i[X_i] = t_i[X_i] = t[X_i]$ follows from (A.1) for $r''$ and $r'$ (for $\alpha = i$). Since
$X_i \cap X_\beta \subseteq X_i$, we get $t'_i[X_i \cap X_\beta] = t_i[X_i \cap X_\beta]$, and from (A.2) for $r''$ (for
$\alpha = i$) we obtain $t'_i[X_i \cap X_\beta] = t_\beta[X_i \cap X_\beta]$ for every $\beta \in \{1, \ldots, k\}$. The
remaining equations needed to prove $r^*$ follow trivially from (A.1) and (A.2)
for $r''$. $\Box$
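
The claim of Lemma 3 with $j = i$ can be sanity-checked by brute force on a small example. The sketch below is an illustration only and makes several assumptions that are not in the article: the JD is $R \bowtie{:}[AB, BC, CD]$ from Example 12, ground rules are taken over all tuples of a two-element domain rather than over $Hull(I, F)$, and rule membership is tested directly through conditions (A.1) and (A.2). All function names are hypothetical.

```python
from itertools import product

ATTRS = "ABCD"
COMPONENTS = ["AB", "BC", "CD"]  # X_1, X_2, X_3 (assumed, as in Example 12)

def proj(t, attrs):
    """Values of tuple t on the attributes in `attrs`, in schema order."""
    return tuple(t[ATTRS.index(a)] for a in ATTRS if a in attrs)

def is_rule(lhs, head):
    """Conditions (A.1) and (A.2) for the unfolded rule lhs -> head."""
    k = len(COMPONENTS)
    a1 = all(proj(lhs[a], COMPONENTS[a]) == proj(head, COMPONENTS[a])
             for a in range(k))
    a2 = all(
        proj(lhs[a], set(COMPONENTS[a]) & set(COMPONENTS[b]))
        == proj(lhs[b], set(COMPONENTS[a]) & set(COMPONENTS[b]))
        for a in range(k) for b in range(k)
    )
    return a1 and a2

def candidate_head(lhs):
    """The only head compatible with (A.1): copy X_a-values from the a-th fact."""
    return tuple(
        next(lhs[a][p] for a, X in enumerate(COMPONENTS) if x in X)
        for p, x in enumerate(ATTRS)
    )

def check_lemma3(domain=(0, 1)):
    """Exhaustively test the j = i claim over all tuples of `domain`."""
    tuples = list(product(domain, repeat=len(ATTRS)))
    by_head = {}
    for lhs in product(tuples, repeat=len(COMPONENTS)):
        head = candidate_head(lhs)
        if is_rule(lhs, head):
            by_head.setdefault(head, []).append(lhs)
    for head, bodies in by_head.items():
        for lhs2 in bodies:                        # r'' = lhs2 -> head
            for i, t_i in enumerate(lhs2):
                for lhs1 in by_head.get(t_i, []):  # r' = lhs1 -> t_i
                    # r*: replace the i-th fact of r'' by the i-th fact of r'.
                    star = lhs2[:i] + (lhs1[i],) + lhs2[i + 1:]
                    if not is_rule(star, head):
                        return False
    return True
```

A run of `check_lemma3()` that returns `True` does not prove the lemma; it only confirms that no counterexample exists in this small setting.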
Proposition 10 For every $I' \in Repairs(I, F)$ and every $R(t) \in Hull(I, F)$

$$R(t) \in I' \iff \exists S \in Supp(R(t)).\ S \subseteq I'.$$



Proof: We fix a repair $I'$ and we say that an $\ell$-support $S$ of $R(t) \in Hull(I, F)$
is proper if $S \subseteq I'$.

The if part is proved with a simple induction over the depth of the derivation of
a support. The proof of the only if part is based on the following simple idea. A fact
$R(t) \in I' \setminus I$ is present in the repair $I'$ to satisfy some ground (full TGD) rule.
We identify this ground rule by considering an inconsistent instance $I' \setminus \{R(t)\}$.
We use this rule to show that $R(t)$ has a proper support.

Recall that the acyclic height $height(R)$ in $V(\mathcal{S}, F)$ of a relation name $R$
is the maximal length of a directed acyclic path that begins in $R$. For $\ell \in
\{-1, 0, \ldots, h\}$ define $\mathcal{S}^\ell = \{R \in \mathcal{S} \mid height(R) \leq \ell\}$ and note that $\mathcal{S}^h = \mathcal{S}$.

We show with induction over $\ell \in \{-1, 0, \ldots, h\}$ that for $R \in \mathcal{S}^\ell$

$$R(t) \in I' \Rightarrow \exists S \in Supp^\ell(R(t)).\ S \subseteq I'.$$

For $\ell = -1$ the claim is trivially true because $\mathcal{S}^{-1} = \emptyset$. We prove the inductive
step by taking the set of all facts in $I'$ that do not have a proper support:

$$T = \{R(t) \in I' \mid R \in \mathcal{S}^\ell \wedge \nexists S \in Supp^\ell(R(t)).\ S \subseteq I'\}.$$

We observe that $T \subseteq I' \setminus I$ because all facts that belong to $I$ have a proper
support obtained with the rule $S_0^\ell$. Also, by IH $T$ contains no facts using
a relation name from $\mathcal{S}^{\ell-1}$, i.e. $T$ contains only facts using relation names
whose acyclic height is $\ell$.

Now, we show that $I' \setminus T$ is consistent. As a subset of a consistent instance
it satisfies all denial constraints. Hence, we only need to check satisfiability of
ground full TGD rules having in the rhs a relation name of acyclic height $\ell$.

First, we take a non-JD ground rule

$$R_1(t_1) \wedge \ldots \wedge R_n(t_n) \to R(t)$$

such that $R_i(t_i) \in I' \setminus T$ for all $i \in \{1, \ldots, n\}$. Note that every $R_i \in \mathcal{S}^{\ell-1}$
and by IH every $R_i(t_i)$ has an $(\ell-1)$-support $S_i$ such that $S_i \subseteq I'$. Now,
$S = S_1 \cup \ldots \cup S_n \subseteq I'$ is an $\ell$-support of $R(t)$ constructed with the rule $S_1^\ell$.
Consequently, $R(t) \notin T$. Naturally, $R(t) \in I'$ because $I'$ is consistent.

Now, consider an (unfolded) JD rule

$$r_0 = R(t_1) \wedge \ldots \wedge R(t_k) \to R(t)$$

such that every $R(t_i) \in I' \setminus T$. Repeatedly we use Lemma 3 and instances of
$S_1^\ell$, used to obtain proper $\ell$-supports of the $R(t_i)$'s, to show the existence of the
following elements:

• a ground rule $r = R(t^*_1) \wedge \ldots \wedge R(t^*_k) \to R(t)$ (being the last element of a
constructed sequence of (unfolded) ground rules $r_0, \ldots, r_k = r$),

• ground rules $r^*_p = R_{p,1}(t^*_{p,1}) \wedge \ldots \wedge R_{p,n_p}(t^*_{p,n_p}) \to R(t^*_p)$ for every $p \in \{1, \ldots, k\}$,

• proper $(\ell-1)$-supports $S^*_{p,q}$ of $R_{p,q}(t^*_{p,q})$ for every $p \in \{1, \ldots, k\}$ and $q \in
\{1, \ldots, n_p\}$.

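For $p \in \{1, \ldots, k\}$, let the $\ell$-support of $R(t_p)$ be obtained with the following
instance of $S_1^\ell$:

$$\frac{\begin{array}{c}
r' = R(t'_1) \wedge \ldots \wedge R(t'_{k_p}) \to R(t_p) \\
r'_\alpha = R_{\alpha,1}(t'_{\alpha,1}) \wedge \ldots \wedge R_{\alpha,m_\alpha}(t'_{\alpha,m_\alpha}) \to R(t'_\alpha) \quad \forall \alpha \in \{1, \ldots, k_p\} \\
R(t_p) \notin I \qquad S_{\alpha,\beta} \in Supp^{\ell-1}(R_{\alpha,\beta}(t'_{\alpha,\beta})) \quad \forall \alpha \in \{1, \ldots, k_p\},\ \forall \beta \in \{1, \ldots, m_\alpha\}
\end{array}}{\bigcup_{\alpha,\beta} S_{\alpha,\beta} \in Supp^\ell(R(t_p))}$$

We apply Lemma 3 to

$$r' = R(t'_1) \wedge \ldots \wedge R(t'_{k_p}) \to R(t_p)$$

and

$$r_{p-1} = R(t^*_1) \wedge \ldots \wedge R(t^*_{p-1}) \wedge R(t_p) \wedge R(t_{p+1}) \wedge \ldots \wedge R(t_k) \to R(t)$$

to obtain

$$r_p = R(t^*_1) \wedge \ldots \wedge R(t^*_{p-1}) \wedge R(t^*_p) \wedge R(t_{p+1}) \wedge \ldots \wedge R(t_k) \to R(t),$$

where $R(t^*_p) = R(t'_j)$ for some $j \in \{1, \ldots, k_p\}$ indicated by Lemma 3. From
the instance of $S_1^\ell$, for $r^*_p$ we take $r'_j$, and for $S^*_{p,q}$ we take $S_{j,q}$ for every
$q \in \{1, \ldots, m_j\}$.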

The elements above show that $R(t)$ has a proper support, i.e. $R(t) \notin T$.
Again, $R(t) \in I'$ because $I'$ is consistent. This finishes the proof that $I' \setminus T$ is
consistent.

Now, recall that $T \subseteq I' \setminus I$, and therefore $I' \setminus T \leq_I I'$. Since $I'$ is a $\leq_I$-minimal
consistent instance, $I' \setminus T = I'$, and consequently $T = \emptyset$. $\Box$

Proposition 11 For every $I' \in Repairs(I, F)$ and every $R(t) \in Hull(I, F)$

$$R(t) \notin I' \iff \exists (B, N) \in Block(R(t)).\ B \subseteq I' \wedge N \cap I' = \emptyset.$$

Proof: We fix a repair $I'$ and we say that an $\ell$-block $(B, N)$ of $R(t) \in
Hull(I, F)$ is proper if $B \subseteq I'$ and $N \cap I' = \emptyset$.

The if part is proved with a simple induction over the depth of the derivation of
a block. The proof of the only if part, although technically complex, is based
on the following simple idea. A fact $R(t) \in I \setminus I'$ is absent in the repair $I'$ because
its presence would cause a violation of some ground rule. We identify this rule
by considering an inconsistent instance $I' \cup \{R(t)\}$. We use this rule to show
that $R(t)$ has a proper block.

Recall that the acyclic depth $depth(R)$ in $V(\mathcal{S}, F)$ of a relation name $R$
is the maximal length of a directed acyclic path that ends in $R$. For $\ell \in
\{-1, 0, \ldots, h\}$ we define $\mathcal{S}^\ell = \{R \in \mathcal{S} \mid depth(R) \leq \ell\}$. Note that $\mathcal{S}^h = \mathcal{S}$ as
the acyclic depth of every relation name is bounded by the acyclic height of
$V(\mathcal{S}, F)$.

We show with an induction over $\ell$ that for $R \in \mathcal{S}^\ell$

$$R(t) \notin I' \Rightarrow \exists (B, N) \in Block^\ell(R(t)).\ B \subseteq I' \wedge N \cap I' = \emptyset.$$

For $\ell = -1$ the claim is trivially true because $\mathcal{S}^{-1} = \emptyset$. We prove the inductive
step by contradiction: we assume that the set of facts that are not in $I'$ and
that do not have a proper $\ell$-block,

$$T = \{R(t) \in Hull(I, F) \setminus I' \mid R \in \mathcal{S}^\ell \wedge R(t) \text{ has no proper } \ell\text{-block}\},$$

is nonempty. We note that $T \subseteq I \setminus I'$ because any fact $R(t) \in
Hull(I, F) \setminus I$ has a proper block obtained with the rule $B_0^\ell$.

Now, take any element $R(t) \in T$, let $jd_R \in F$ be the JD on relation $R$, and
consider the set $V = T_{\{jd_R\}}(I' \cup \{R(t)\}) \setminus I'$. For any $R(t') \in V$ by $r_{R(t')}$ we
denote an arbitrarily chosen ground JD rule $R(t) \wedge R(t_1) \wedge \ldots \wedge R(t_k) \to R(t')$
such that every $R(t_i) \in I'$.

We claim that: (1) $I' \cup V$ is consistent, and (2) $V \subseteq T$.

(1) Because $I' \cup V$ is obtained by adding facts using the relation name $R$ to
a consistent instance, we only need to verify that ground rules having $R$
in their lhs are satisfied.

First, take an (unfolded) ground JD rule

$$r'' = R(t'_1) \wedge \ldots \wedge R(t'_p) \wedge R(t_1) \wedge \ldots \wedge R(t_k) \to R(s)$$

such that every $R(t'_j)$ belongs to $V$ and every $R(t_i)$ belongs to $I'$. We
iteratively apply Lemma 3 to the rule above using rules $r_{R(t'_j)}$ and obtain
the rule

$$R(t^*_1) \wedge \ldots \wedge R(t^*_p) \wedge R(t_1) \wedge \ldots \wedge R(t_k) \to R(s),$$

where each $R(t^*_j)$ is either $R(t)$ or belongs to $I'$. If all $R(t^*_j)$'s belong to
$I'$, then $R(s) \in I'$ because $I'$ is consistent. Otherwise, $R(s) \in V$.

For the remaining (non-JD) ground rules, we show that if such a rule is
not satisfied in $I' \cup V$, then a proper $\ell$-block for $R(t)$ can be constructed
(which contradicts $R(t) \in T$). If there is a ground denial rule

$$R(t'_1) \wedge \ldots \wedge R(t'_n) \wedge R_1(s_1) \wedge \ldots \wedge R_m(s_m) \to false$$

such that every $R(t'_i) \in V$ and every $R_j(s_j) \in I'$, then a proper $\ell$-block of
$R(t)$ is constructed with the rule $B_1^{-1}$. Similarly, if there is a ground rule

$$R(t'_1) \wedge \ldots \wedge R(t'_n) \wedge R_1(s_1) \wedge \ldots \wedge R_m(s_m) \to P(s)$$

such that every $R(t'_i) \in V$, every $R_j(s_j) \in I'$, and $P(s) \notin I' \cup V$, then by
IH $P(s)$ has a proper $(\ell-1)$-block and a proper $\ell$-block is constructed
with the rule $B_1^\ell$.

(2) For $R(t') \in V$ we show that if $R(t')$ has a proper $\ell$-block, then $R(t)$ has
a proper $\ell$-block as well (which contradicts $R(t) \in T$).

Suppose some $R(t') \in V$ has a proper $\ell$-block constructed with the
following instance of the rule $B_1^\ell$:

$$\frac{\begin{array}{c}
r = R(t'_1) \wedge \ldots \wedge R(t'_n) \wedge R_1(s_1) \wedge \ldots \wedge R_m(s_m) \to P(s) \\
r_i = R(t') \wedge R(t'_{i,1}) \wedge \ldots \wedge R(t'_{i,k_i}) \to R(t'_i) \quad \forall i \in \{1, \ldots, n\} \\
S_{i,j} \in Supp(R(t'_{i,j})) \quad \forall i \in \{1, \ldots, n\},\ \forall j \in \{1, \ldots, k_i\} \\
S_p \in Supp(R_p(s_p)) \quad \forall p \in \{1, \ldots, m\} \\
R(t') \in I \qquad (B, N) \in Block^{\ell-1}(P(s))
\end{array}}{(\bigcup_{i,j} S_{i,j} \cup \bigcup_p S_p \cup B,\ N) \in Block^\ell(R(t'))}$$

For every $i \in \{1, \ldots, n\}$ we apply Lemma 3 to $r' = r_{R(t')}$ and $r'' = r_i$,
and obtain $r^*_i$. (More precisely, we take an unfolded version of $r''$ and apply
Lemma 3 to every occurrence of $R(t')$ in $r''$.) We observe that for every
$i \in \{1, \ldots, n\}$ if the ground rule $r^*_i$ does not have $R(t)$ in its lhs, then all
the facts in its lhs belong to $I'$ and consequently have a proper support. Let
$X \subseteq \{1, \ldots, n\}$ be the set of indexes of all rules $r^*_i$ which have $R(t)$ in their
lhs. Using $B_1^\ell$ with the ground rules $r$, $r^*_i$ for $i \in X$, the corresponding proper
supports, and the proper $(\ell-1)$-block of $P(s)$ we obtain a proper $\ell$-block of $R(t)$.

Finally, we observe that $I' \cup V <_I I'$ because $V \subseteq T \subseteq I \setminus I'$. However, $I'$ is
by definition a $\leq_I$-minimal consistent instance; a contradiction. $\Box$

Proposition 12 For any fact $R(t)$ the sets $Supp(R(t))$ and $Block(R(t))$ can
be constructed in time polynomial in the size of $I$.

Proof: Here we only give a combinatorial argument showing that the number
of all supports and blocks of a fact is bounded by a polynomial of the size of $I$.
A polynomial algorithm that generates the supports and blocks can be easily
derived.

By $K$ we denote the maximum number of atoms used in the definition of
a constraint in $F$. Since we assume the set of constraints to be fixed, $K$ is a
constant. Also, note that the acyclic height $h$ of the dependency graph $V(\mathcal{S}, F)$
is bounded by the cardinality of $\mathcal{S}$.

We recall that every ground rule corresponds to a subset of $Hull(I, F)$ of
cardinality at most $K$. Hence, the number of all ground rules is bounded by
$R = |Hull(I, F)|^{K+1}$.

First, with a simple induction over $\ell \in \{-1, 0, \ldots, h\}$ we show that the number
of all $\ell$-supports of a fact is bounded by $R^{(K+1)^{2(\ell+1)}}$. This bound holds
trivially for $\ell = -1$. To show the inductive step, we note that $S_1^\ell$ can be
instantiated with at most $R^{K+1}$ possible combinations of ground rules and
$(\ell-1)$-supports of $K^2$ facts. By IH each of the facts has at most $R^{(K+1)^{2\ell}}$
$(\ell-1)$-supports. Together, the number of possible combinations is bounded
by

$$R^{(K+1)^{2(\ell+1)} - (2K+1)(K+1)^{2\ell} + K + 1} \leq R^{(K+1)^{2(\ell+1)}}.$$

Hence the number of all supports of a fact is bounded by $N = R^{(K+1)^{2(h+1)}}$.
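
The exponent arithmetic behind the inductive step for supports can be checked numerically. The sketch below assumes that the bound on $\ell$-supports reads $R^{(K+1)^{2(\ell+1)}}$, that the rule combinations contribute a factor $R^{K+1}$, and that $K^2$ facts each contribute at most $R^{(K+1)^{2\ell}}$ supports; these readings are reconstructed from the surrounding argument, and the function names are hypothetical.

```python
def combined_exponent(K, l):
    """Exponent of R in R^(K+1) * (R^((K+1)^(2l)))^(K^2)."""
    return (K + 1) + K**2 * (K + 1) ** (2 * l)

def displayed_exponent(K, l):
    """Left-hand exponent of the displayed inequality."""
    return (K + 1) ** (2 * (l + 1)) - (2 * K + 1) * (K + 1) ** (2 * l) + K + 1

def target_exponent(K, l):
    """Exponent of the claimed bound on l-supports."""
    return (K + 1) ** (2 * (l + 1))

def check_support_bound(max_K=8, max_l=6):
    """Verify the identity and the inequality for small K >= 1 and l >= 0."""
    for K in range(1, max_K + 1):
        for l in range(0, max_l + 1):
            if combined_exponent(K, l) != displayed_exponent(K, l):
                return False
            if displayed_exponent(K, l) > target_exponent(K, l):
                return False
    return True
```

The identity holds because $K^2 = (K+1)^2 - (2K+1)$, and the inequality holds because $(2K+1)(K+1)^{2\ell} \geq K+1$ for $\ell \geq 0$.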

Now, with a simple induction over $\ell \in \{-1, 0, \ldots, h\}$ we show that the number
of all $\ell$-blocks of a fact is bounded by $(R^{K+1}N^{K^2})^{\ell+2}$. For $\ell = -1$ the set
of blocks is constructed either with $B_0^{-1}$ or $B_1^{-1}$. For the former the claim
holds trivially. For the latter we observe that there are at most $R^{K+1}$ combinations
of ground rules and $N^{K^2}$ combinations of supports used in $B_1^{-1}$, giving
together $R^{K+1}N^{K^2}$. Similarly, for the inductive step we observe that $B_1^\ell$ can be
instantiated with $R^{K+1}$ combinations of rules, $N^{K^2}$ combinations of supports,
and, from IH, at most $(R^{K+1}N^{K^2})^{\ell+1}$ $(\ell-1)$-blocks. Together, this gives us
exactly $(R^{K+1}N^{K^2})^{\ell+2}$, and hence the number of all blocks of a fact is bounded
by $P = (R^{K+1}N^{K^2})^{h+2}$.

Finally, we observe that both $N$ and $P$ are polynomials when viewed as functions
of the size of $I$. $\Box$